How to check if the remote rsyslog client is running

remote rsyslog

At the moment I'm implementing the part of a monitoring system that is built around rsyslog and auditd. I would like the project to verify correctly that the rsyslog facility on a remote client is running. This check should be repeated on the rsyslog server at short intervals.

The simplest way to do that is to track the modification timestamp of the log file. If the log file that the rsyslog server writes the incoming messages to has not been modified recently, we can conclude that the remote rsyslog went down and is no longer sending messages. But I doubt the correctness of this method.
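For reference, the timestamp check could be sketched roughly like this. It is only a minimal sketch, assuming GNU stat (stat -c %Y) is available on the server; the function name log_is_fresh, the log path in the usage comment, and the 300-second threshold are illustrative assumptions.

```bash
#!/usr/bin/env bash
# log_is_fresh FILE MAX_AGE_SECONDS
# Succeeds (exit 0) if FILE was modified within the last MAX_AGE_SECONDS,
# fails with 1 if it is older, and with 2 if it cannot be examined.
# Assumes GNU coreutils stat; the name and arguments are hypothetical.
log_is_fresh() {
    local file="$1" max_age="$2" now mtime
    [ -f "$file" ] || return 2              # missing file: treat as an error
    now=$(date +%s)
    mtime=$(stat -c %Y "$file") || return 2 # modification time, epoch seconds
    [ $(( now - mtime )) -le "$max_age" ]
}

# Hypothetical usage:
# log_is_fresh /var/log/remote/client1.log 300 || echo "client1 looks silent"
```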

Best Answer

That's not a bad idea. To improve it a little, each of the clients could be configured to send a message at a regular interval (providing a "heartbeat" or a "mark"). With Rsyslog, the mark messages themselves come from the immark input module, and the directive that makes an action always write them is $ActionWriteAllMarkMessages [on|off]. But heed the man page:

Note that this option auto-resets to "off", so if you intend to use it with multiple actions, it must be specified in front off all selector lines that should provide this functionality.

What if an undiscovered configuration problem on the Syslog server causes messages to be directed to an unexpected location, such as the file you monitor to show that the client is alive? The file's modification time would then keep changing even though the client itself is silent. A more rigorous test might be to search (grep/awk) for the "heartbeat" or "mark" messages enabled by $ActionWriteAllMarkMessages, looking at the time of the latest message from a specific host.
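As a rough illustration of that search, the following sketch assumes the traditional syslog file format ("Mon DD HH:MM:SS host tag: message", so the hostname is the fourth field) and the "-- MARK --" text that rsyslog's mark messages carry. The function name last_mark and its arguments are hypothetical; adjust the field positions if your template differs.

```bash
#!/usr/bin/env bash
# last_mark HOST LOGFILE
# Prints the timestamp (first three fields) of the most recent "-- MARK --"
# line whose hostname field equals HOST; exits non-zero if none is found.
last_mark() {
    local host="$1" logfile="$2"
    awk -v host="$host" '
        $4 == host && /-- MARK --/ { ts = $1 " " $2 " " $3; found = 1 }
        END { if (found) print ts; else exit 1 }
    ' "$logfile"
}
```

The printed timestamp can then be compared with the current time to decide whether the host has missed too many marks.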

To verify that the remote Syslog server is running, you could use netcat (nc) with the -z and -u switches. From the manual:

-u Use UDP instead of the default option of TCP.

-z Specifies that nc should just scan for listening daemons, without sending any data to them. [...]

For example, with a five-second timeout (-w5):

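#!/usr/bin/env bash
# Probe the Syslog port: -z scans without sending data, -u selects UDP,
# and -w5 sets a five-second timeout.
# Note: UDP is connectionless, so nc can only infer that the port is open
# from the absence of an ICMP "port unreachable" reply.
hostname="<FQDN or IP Address>"
port="514"

if nc -z -u -w5 "$hostname" "$port" > /dev/null 2>&1; then
    echo "Syslog is up."
else
    echo "Syslog could not be reached."
fi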

unset hostname
unset port

Or, if the Syslog server also listens on TCP, you might omit the -u switch to take advantage of the reliability of TCP, if only for this check. This verifies that the Syslog server is listening and reachable from the host performing the connectivity check. Another facet is equally important: disk space. (If the Syslog server runs out of space to write messages, it may as well be considered down.)
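A disk-space check could be sketched along these lines. It relies on the portable df -P output format, where the fifth column of the second line is the used capacity; the function name disk_usage_pct, the /var/log path, and the 90% threshold are assumptions for illustration.

```bash
#!/usr/bin/env bash
# disk_usage_pct PATH
# Prints the used-space percentage (without the % sign) of the filesystem
# that holds PATH, using the portable "df -P" output format.
disk_usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical usage: warn when the log filesystem is 90% full or more.
# [ "$(disk_usage_pct /var/log)" -lt 90 ] || echo "Log disk nearly full."
```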

Because UDP is "unreliable" by design, counting on the "mark" or "heartbeat" message might not always work on a congested network or with an overloaded Syslog server. Another approach might be to install a script on each client. The script (Bash, Python, or whatever works) could return whichever error code you devise (e.g., return 1 if the Syslog process is not running, or try to start the Syslog process first), or perform other tests such as connectivity checks. Use xinetd to start the script from /etc/xinetd.d/script_name (on RHEL/CentOS):

service check_syslog
{
    type        = UNLISTED
    port        = 6777
    socket_type = stream
    protocol    = tcp
    wait        = no
    user        = root
    server      = /usr/local/sbin/script_name
    only_from   = 127.0.0.1 10.0.0.110
    disable     = no
}

On RHEL the restart command is service xinetd restart. Optionally, edit /etc/services to give a name to port 6777 (with type = UNLISTED, xinetd does not require the entry). I like to use iptables to augment the only_from stanza. By the way, port 6777 is used here just for illustration.
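The script that xinetd launches might look something like the sketch below. It is a hypothetical body for /usr/local/sbin/script_name, assuming pgrep is installed and that the daemon's process name is rsyslogd. Since a remote caller only sees what the script writes to the socket, this sketch reports the result as a line of text; the function name report_status and the OK/FAIL wording are made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical body for /usr/local/sbin/script_name.
# xinetd connects the script's stdout to the TCP client, so the status is
# reported as a line of text.
report_status() {
    local proc="$1"
    # pgrep -x matches the process name exactly.
    if pgrep -x "$proc" > /dev/null 2>&1; then
        echo "OK $proc is running"
    else
        echo "FAIL $proc is not running"
    fi
}

report_status rsyslogd
```

From a host listed in only_from, connecting to the client on port 6777 (for example with nc) should then print the status line.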
