How to Set Up a Centralized Log Server with rsyslog

For many years, we've been running an ELK (Elasticsearch, Logstash, Kibana) stack for centralized logging. We have a specific project that requires on-premise infrastructure, so sending logs off-site to a hosted solution was not an option. Over time, however, the maintenance requirements of this self-maintained ELK stack were staggering. Filebeat, for example, filled up all the disks on all the servers in a matter of hours, not once, but twice (and for different reasons) when it could not reach its Logstash/Elasticsearch endpoint. Metricbeat suffered from a similar issue: It used far too much disk space relative to the value provided in its Elasticsearch indices. And while provisioning a self-hosted ELK stack has gotten easier over the years, it's still a lengthy process, which requires extra care anytime an upgrade is needed. Are these problems solvable? Yes. But for our needs, we wanted something simpler.
Enter rsyslog. rsyslog has been around since 2004. It's an alternative to the traditional syslogd (and to syslog-ng). It's fast. And relative to an ELK stack, its RAM and CPU requirements are negligible.
This idea started as a proof-of-concept, and quickly turned into a production-ready centralized logging service. Our goals are as follows:
- Set up a single VM to serve as a centralized log aggregator. We want the simplest possible solution, so we're going to combine all logs for each environment into a single log file, relying on the source IP address, hostname, log facility, and tag in each log line to differentiate where logs are coming from. Then, we can use tail, grep, and other command-line tools to watch or search those files, like we might have through the Kibana web interface previously.
- On every other server in our cluster, we'll also use rsyslog to read and forward logs from the log files created by our application. In other words, we want an rsyslog configuration to mimic how Filebeat worked for us previously (or how the AWS CloudWatch Logs agent works, if you're using AWS).
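Since everything for an environment lands in one file, it helps to know what each aggregated line looks like. With rsyslog's default file format, every line carries a timestamp, the sending hostname, and the tag, so plain grep is enough to isolate one source. Here's a quick sketch (the hosts, tags, and messages are made-up sample data):

```shell
# Write a couple of lines in rsyslog's traditional file format
# (timestamp, hostname, tag, message) -- hypothetical sample data:
printf '%s\n' \
  'May 23 14:02:03 web1 myapp_django: GET /healthz 200' \
  'May 23 14:02:04 worker1 myapp_celery: task succeeded in 0.4s' \
  > /tmp/production-sample.log

# The hostname and tag in each line let us filter by source:
grep myapp_celery /tmp/production-sample.log
```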
Disclaimer: Throughout this post, we'll show you how to install and configure rsyslog manually, but you'll probably want to automate that with your configuration management tool of choice (Ansible, Salt, Chef, Puppet, etc.).
Log Aggregator Setup
On a central logging server, first install rsyslog and its relp module (for lossless log sending/receiving):

sudo apt install rsyslog rsyslog-relp

As of 2019, rsyslog is the default logger on current Debian and Ubuntu releases, but rsyslog-relp is not installed by default. We've included both for clarity.
Now, we need to create a minimal rsyslog configuration to receive logs and write them to one or more files. Let's create a file at /etc/rsyslog.d/00-log-aggregator.conf, with the following content:
module(load="imrelp")
ruleset(name="receive_from_12514") {
    action(type="omfile" file="/data/logs/production.log")
}
input(type="imrelp" port="12514" ruleset="receive_from_12514")
If needed, we can listen on one or more additional ports, and write those logs to a different file by appending new ruleset and input settings in our config file:
ruleset(name="receive_from_12515") {
    action(type="omfile" file="/data/logs/staging.log")
}
input(type="imrelp" port="12515" ruleset="receive_from_12515")
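If a single combined file per environment ever becomes unwieldy, rsyslog can also split the stream by sender. As a hypothetical variation (not part of our setup), a string template plus omfile's dynaFile option writes one file per source hostname, replacing the single-file ruleset of the same name:

```
# Hypothetical: one file per sending host instead of one combined file
template(name="per_host_file" type="string"
         string="/data/logs/production/%hostname%.log")

ruleset(name="receive_from_12514") {
    action(type="omfile" dynaFile="per_host_file")
}
```

We stuck with one file per environment to keep tail and grep simple, but this is an easy escape hatch if per-host files suit you better.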
Rotating Logs
You'll probably want to rotate these logs from time to time as well.
You can do that with a simple logrotate config. Create a new file /etc/logrotate.d/rsyslog_aggregator with the following content:
/data/logs/*.log {
    rotate 365
    daily
    compress
    missingok
    notifempty
    dateext
    dateformat .%Y-%m-%d
    dateyesterday
    postrotate
        /usr/lib/rsyslog/rsyslog-rotate
    endscript
}
This configuration rotates log files daily, compresses rotated files, and names each rotated file with the date of the day it covers.
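The dateext, dateformat, and dateyesterday directives determine the rotated filenames we'll be searching later: the suffix is strftime-style output, and dateyesterday uses the previous day's date, since the rotated file contains yesterday's logs. For example, with GNU date:

```shell
# The suffix appended to rotated files is just strftime output;
# with dateyesterday, logrotate uses the previous day's date:
date -d 'yesterday' +.%Y-%m-%d
# So a staging.log rotated on 2019-05-24 would become
#   /data/logs/staging.log.2019-05-23  (plus .gz once compressed)
```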
To see what this logrotate configuration will do (without actually doing anything), you can run it with the --debug option:
logrotate --debug /etc/logrotate.d/rsyslog_aggregator
To customize this configuration further, look at the logrotate man page (or type man logrotate on your UNIX-like operating system of choice).
Sending Logs to Our Central Server
We can also use rsyslog to send logs to our central server, with the help of the imfile module. First, we'll need the same packages installed on each server that will forward logs:
sudo apt install rsyslog rsyslog-relp
Create a file /etc/rsyslog.d/90-log-forwarder.conf with the following content:
# Poll each file every 2 seconds
module(load="imfile" PollingInterval="2")
# Create a ruleset to send logs to the right port for our environment
module(load="omrelp")
ruleset(name="send_to_remote") {
    action(type="omrelp" target="syslog" port="12514") # production
}
# Send all files on this server to the same remote, tagged appropriately
input(
    type="imfile"
    File="/home/myapp/logs/myapp_django.log"
    Tag="myapp_django:"
    Facility="local7"
    Ruleset="send_to_remote"
)

input(
    type="imfile"
    File="/home/myapp/logs/myapp_celery.log"
    Tag="myapp_celery:"
    Facility="local7"
    Ruleset="send_to_remote"
)
Again, we've listed a few example log files and tags here, but you may wish to create this file with a configuration management tool that allows you to templatize it (and create each input() in a Jinja2 {% for %} loop, for example).
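As a sketch of that templating approach, here's roughly what a Jinja2 template for 90-log-forwarder.conf could look like. The variable names (log_target, log_port, forwarded_logs) are our own invention; adapt them to your tool's conventions:

```
{# Hypothetical Jinja2 template for /etc/rsyslog.d/90-log-forwarder.conf #}
module(load="imfile" PollingInterval="2")

module(load="omrelp")
ruleset(name="send_to_remote") {
    action(type="omrelp" target="{{ log_target }}" port="{{ log_port }}")
}

{% for log in forwarded_logs %}
input(
    type="imfile"
    File="{{ log.path }}"
    Tag="{{ log.tag }}:"
    Facility="local7"
    Ruleset="send_to_remote"
)
{% endfor %}
```

With this in place, adding a new log file to forward is just one more entry in the forwarded_logs variable.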
Be sure to restart rsyslog (i.e., sudo service rsyslog restart) any time you change this configuration file, and inspect /var/log/syslog carefully for any errors reading or sending your log files.
Watching & Searching Logs
Since we've given up our fancy Kibana web interface, we need to search logs through the command line now. Thankfully, that's fairly easy with the help of tail, grep, and zgrep.
To watch logs come through as they happen, just type:
tail -f /data/logs/staging.log
You can also pipe that into grep, to narrow down the logs you're watching to a specific host or tag, for example:
tail -f /data/logs/staging.log | grep myapp_celery
If you want to search previous log entries from today, you can do that with grep, too:
grep myapp_django /data/logs/staging.log
If you want to search the logs for a few specific days, you can do that with zgrep:
zgrep myapp_celery /data/logs/staging.log.2019-05-{23,24,25}.gz
Of course, you could search all logs from all time with the same method, but that might take a while:
zgrep myapp_django /data/logs/staging.log.*.gz
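Beyond grep, the usual text tools work on these files too. For instance, since the hostname is the fourth whitespace-separated field in rsyslog's default file format, awk can tally lines per host. This example runs against a few made-up sample lines via a here-document; in practice you'd point it at /data/logs/production.log:

```shell
# Count log lines per sending host (hostname = 4th field):
awk '{ counts[$4]++ } END { for (h in counts) print counts[h], h }' <<'EOF'
May 23 14:02:03 web1 myapp_django: GET / 200
May 23 14:02:04 web1 myapp_django: GET / 500
May 23 14:02:05 worker1 myapp_celery: task succeeded
EOF
```

This is a handy way to spot a single server that's suddenly logging far more than its peers.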
Conclusion
There are a myriad of ways to configure rsyslog (and centralized logging generally), often with little documentation about how best to do so. Hopefully this helps you consolidate logs with minimal resource overhead. Feel free to comment below with feedback, questions, or the results of your tests with this method.