Linux Syslog – Learning About General Logging and Logrotation on Linux

linux logrotate syslog

Assume that, besides the Apache web server logs, I have never had any contact with any kind of (professional) logs on any operating system. So logging, although I understand some basics, is altogether a pretty new topic. At the moment the investment needed to fully learn about this topic seems quite large, and I don't even know yet whether it is worth knowing more than the most abstract concepts.

Which resources (tutorials, man pages, books) would you suggest for someone in that situation to learn about logging?

Which logs should a normal Linux user read on a daily/monthly basis? Is the assumption even correct that they are written for human readability or are they generally evaluated and used by other tools?

What should the normal *nix user and software developer know about these logs?

What do you need to know about log rotation, if you are not expected to manage professional web servers with huge loads of events?

Best Answer

[This was written a few years before the widespread adoption of journald on systemd systems and does not touch on it. Currently (late 2018) both journald and (r)syslog, described below, are used on distros such as Debian. On others, you may have to install rsyslog if you want to use it alongside, but the integration with journald is straightforward.]

I won't discuss logging with regard to Ubuntu specifically much, since the topic is standardized for Linux in general (and I believe most or all of what I have to say is also true in general for any flavor of *nix, but don't take my word for that). I also won't say much about "how to read logs" beyond answering this question:

Is the assumption even correct that they are written for human readability or are they generally evaluated and used by other tools?

I guess that depends on the application, but in general, at least with regard to what goes into syslog (see below), they should be human readable. "Meaningful to me" is another issue, lol. However, they may also be structured in a way that makes parsing them with standard tools (grep, awk, etc) for specific purposes easier.

Anywho, first, there is a distinction between applications which do their own logging and applications which use the system logger. Apache by default is the former, although it can be configured to do the latter (which I think most people would consider undesirable). Applications which do their own logging could do so in any manner using any location for the file(s), so there is not much to say about that. The system logger is generally referred to as syslog.

syslog

"Syslog" is really a standard that is implemented with a daemon process generically called syslogd (d is for daemon!). The predominant syslog daemon currently in use on Linux, including Ubuntu, is rsyslogd. Rsyslogd can do a lot, but as configured out of the box on most distros it emulates a traditional syslog, which sorts stuff into plain text files in /var/log. You might find documentation for it in /usr/share/doc/rsyslog-doc-[version] (beware, there is also a /usr/share/doc/rsyslog-[version], but that's just notices from the source package such as NEWS and ChangeLog). If it's there, it's HTML, but Stack Exchange doesn't permit embedding local file links:

file:///usr/share/doc/rsyslog-doc/index.html

So you could try copy pasting that. If it's not there, it may be part of a separate package that is not installed. Query your packaging system (eg, apt-cache search rsyslog | grep doc).

The configuration is in /etc/rsyslog.conf, which has a manual page, man rsyslog.conf, although the manual page makes a better reference than an introduction. Fortunately, the fundamentals of the stock rsyslog.conf conform to those of the traditional syslog.conf, for which there are many introductions and tutorials around. What you want to take away from one of those, while peering at your local rsyslog.conf, is an understanding of facilities and priorities ("priority" is sometimes referred to as loglevel), since these are part of the aforementioned syslog standard. The reason this standard is important is that rsyslog actually gets its stuff via the kernel, and what the kernel implements is the standard.
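To make facilities and priorities a bit more concrete: on the wire, the syslog standard packs both into one number, PRI = facility * 8 + severity. The following is only an illustrative sketch; the function name syslog_pri is made up for this example, and the numeric codes are the ones listed in RFC 5424 (which match sys/syslog.h on Linux).

```shell
#!/bin/bash
# syslog_pri is a hypothetical helper, not part of rsyslog or any package.
# It prints the PRI value for a facility name and a severity name.
syslog_pri() {
    local facility severity
    case "$1" in
        kern)       facility=0 ;;
        user)       facility=1 ;;
        mail)       facility=2 ;;
        daemon)     facility=3 ;;
        auth)       facility=4 ;;
        syslog)     facility=5 ;;
        lpr)        facility=6 ;;
        news)       facility=7 ;;
        uucp)       facility=8 ;;
        cron)       facility=9 ;;
        authpriv)   facility=10 ;;
        ftp)        facility=11 ;;
        local[0-7]) facility=$(( 16 + ${1#local} )) ;;
        *) echo "unknown facility: $1" >&2; return 1 ;;
    esac
    case "$2" in
        emerg)   severity=0 ;;
        alert)   severity=1 ;;
        crit)    severity=2 ;;
        err)     severity=3 ;;
        warning) severity=4 ;;
        notice)  severity=5 ;;
        info)    severity=6 ;;
        debug)   severity=7 ;;
        *) echo "unknown severity: $2" >&2; return 1 ;;
    esac
    echo $(( facility * 8 + severity ))
}

syslog_pri user notice   # prints 13
syslog_pri local0 info   # prints 134
```

A selector such as mail.err in rsyslog.conf is matching on exactly these two fields; lower severity numbers are more urgent.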

With regard to the $ directives in rsyslog.conf, these are rsyslog specific and if you install that optional doc package you'll find a guide to them in rsyslog_conf_global.html.

Have fun...if you are curious about how applications use the system logger, look at man logger and man 3 syslog.
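As a rough illustration of the logger side of that, here is a small wrapper a shell script might use. The function name log_event and the tag myapp are placeholders invented for this sketch; -p (facility.priority) and -t (tag) are standard logger options.

```shell
#!/bin/bash
# log_event is a hypothetical wrapper; "myapp" is a placeholder tag.
# -p sets facility.priority, -t sets the tag that prefixes each message,
# and -- ends option parsing so a message may start with a hyphen.
log_event() {
    local priority=$1
    shift
    logger -p "user.$priority" -t myapp -- "$*"
}

# Example call (commented out so that sourcing this file logs nothing):
# log_event notice "backup finished"
```

Where the message ends up depends on how the local rsyslog.conf routes user.notice; on many systems that is /var/log/syslog or /var/log/messages, and on journald systems journalctl -t myapp should show it as well.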

Log Rotation

The normative means of rotating logs is via a tool called logrotate (and there is a man logrotate). The normative method of using logrotate is via the cron daemon, although it does not have to be done that way (e.g., if you tend to turn your desktop off every day, you might as well just do it once at boot before syslog starts but, obviously, after the filesystem is mounted rw).

There's a good introduction to logrotate here. Note that logrotate is not just for syslog stuff, it can be used with any file at all. The base configuration file is /etc/logrotate.conf, but since the configuration has an "include" directive, commonly most stuff goes into individual files in the /etc/logrotate.d directory (here d is for directory, not daemon; logrotate is not a daemon).
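To give an idea of what one of those per-application files looks like, the sketch below writes a made-up config for a made-up myapp.log into a scratch directory and, if logrotate is installed, dry-runs it. The file names and the choice of directives are assumptions for illustration; nothing under /etc is touched.

```shell
#!/bin/bash
# Everything lives in a throwaway directory; myapp.log and myapp.conf are
# hypothetical names used only for this demonstration.
workdir=$(mktemp -d)
touch "$workdir/myapp.log"

# A typical small stanza: rotate weekly, keep four old copies, compress them,
# and don't complain if the log is missing or empty.
cat > "$workdir/myapp.conf" <<EOF
$workdir/myapp.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
}
EOF

# -d is logrotate's debug mode: it reports what it would do without changing
# any log; -s points it at a private state file instead of the system one.
if command -v logrotate >/dev/null 2>&1; then
    logrotate -d -s "$workdir/state" "$workdir/myapp.conf" \
        || echo "logrotate reported a problem with the sample config" >&2
fi
```

A real file in /etc/logrotate.d would contain just the stanza, with the actual log path in place of the scratch one.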

An important thing to consider when using logrotate is how an application will react when its log file gets "rotated" (in other words, moved) while the application is running. WRT (r)syslogd, it keeps the file it already has open, so after the move it carries on writing to the renamed file and nothing new shows up under the original name. The usual way to deal with that is to tell syslog to re-open all its files, which is why you will see a postrotate directive in logrotate conf files sending SIGHUP to the syslog daemon.
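A small experiment helps picture what a rename does to a process that holds the file open: the writes follow the file to its new name, and the original path receives nothing until the process opens it again. This is only a stand-in for a daemon, using a shell file descriptor in a scratch directory; the file names are made up.

```shell
#!/bin/bash
# Simulate a long-running writer with file descriptor 3.
workdir=$(mktemp -d)
log="$workdir/app.log"

exec 3>>"$log"              # the "daemon" opens its log once
echo "before rotation" >&3

mv "$log" "$log.1"          # in essence, what logrotate does

echo "after rotation" >&3   # still goes to the renamed file, app.log.1

exec 3>&-                   # the "reopen" step: close the old descriptor
exec 3>>"$log"              # and open the configured path again
echo "after reopen" >&3
exec 3>&-

cat "$log.1"   # before rotation, after rotation
cat "$log"     # after reopen
```

The close-and-open pair in the middle is the part that the SIGHUP in a postrotate script asks the syslog daemon to do.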

Debugging the issue

Put these lines in a shell script, logrotate.sh:

#!/bin/bash
/usr/sbin/logrotate -f -v /etc/logrotate.d/mail3-logs &>> /var/log/logrotate/rotate.log

Make it executable and run it like this from cron (the root user field means this is the /etc/crontab or /etc/cron.d format):

03 00 * * * root strace -s 2000 -o /tmp/strace.log /path/to/logrotate.sh

Going through the strace output in /tmp/strace.log, you should see which call is getting tripped up by the permissions problem.

EDIT #1

After conversing with the OP, it turned out that the above debugging technique had uncovered that SELinux was enabled. He was perplexed as to why this was the case since he had previously disabled it with the command setenforce 0.

setenforce 0 does not actually disable SELinux: it switches it to permissive mode, and only until the next reboot. The mode SELinux starts in is dictated by this file on Fedora/CentOS:

$ cat /etc/sysconfig/selinux
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#   enforcing - SELinux security policy is enforced.
#   permissive - SELinux prints warnings instead of enforcing.
#   disabled - SELinux is fully disabled.
SELINUX=disabled
# SELINUXTYPE= type of policy in use. Possible values are:
#   targeted - Only targeted network daemons are protected.
#   strict - Full SELinux protection.
SELINUXTYPE=targeted

To change the mode permanently, set the SELINUX=... line to one of the 3 states (enforcing, permissive, disabled) and reboot; SELINUX=disabled turns it off entirely.
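To check whether the running mode and the configured mode disagree, which is what caught the OP out, the two can be printed side by side. A minimal sketch, assuming the file format shown above; the function name selinux_configured_mode is made up for this example, and getenforce is only called if it is installed.

```shell
#!/bin/bash
# selinux_configured_mode is a hypothetical helper. It prints the value of
# the SELINUX= line from the given file (default: /etc/sysconfig/selinux).
selinux_configured_mode() {
    # Comment lines start with '#', and SELINUXTYPE= does not match 'SELINUX='.
    sed -n 's/^SELINUX=\([a-z]*\).*$/\1/p' "${1:-/etc/sysconfig/selinux}"
}

# Compare the boot-time setting with the mode in effect right now.
if command -v getenforce >/dev/null 2>&1; then
    echo "running:    $(getenforce)"
    echo "configured: $(selinux_configured_mode)"
fi
```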

I would encourage you however to take the time to understand why SELinux is disallowing the access to the directory these log files are within, and add the appropriate contexts so that SELinux allows this access. SELinux is an important part of the layered security model on Linux distros that make use of it, and blindly disabling it takes one of the critical layers away.

linux – Understanding Logging in Linux

Simplified, it goes more or less like this:

The kernel logs messages (using the printk() function) to a ring buffer in kernel space. These messages are made available to user-space applications in two ways: via the /proc/kmsg file (provided that /proc is mounted), and via the sys_syslog syscall.

There are two main applications that read (and, to some extent, can control) the kernel's ring buffer: dmesg(1) and klogd(8). The former is intended to be run on demand by users, to print the contents of the ring buffer. The latter is a daemon that reads the messages from /proc/kmsg (or calls sys_syslog, if /proc is not mounted) and sends them to syslogd(8), or to the console. That covers the kernel side.

In user space, there's syslogd(8). This is a daemon that listens on a number of UNIX domain sockets (mainly /dev/log, but others can be configured too), and optionally on UDP port 514 for messages. It also receives messages from klogd(8) (syslogd(8) doesn't care about /proc/kmsg). It then writes these messages to some files in /var/log, or to named pipes, or sends them to some remote hosts (via the syslog protocol, on UDP port 514), as configured in /etc/syslog.conf.

User-space applications normally use the libc function syslog(3) to log messages. libc sends these messages to the UNIX domain socket /dev/log (where they are read by syslogd(8)), but if an application is chroot(2)-ed the messages might end up being written to other sockets, for instance to /var/named/dev/log. It is, of course, essential for the applications sending these logs and syslogd(8) to agree on the location of these sockets. For this reason syslogd(8) can be configured to listen to additional sockets aside from the standard /dev/log.

Finally, the syslog protocol is just a datagram protocol. Nothing stops an application from sending syslog datagrams to any UNIX domain socket (provided that its credentials allow it to open the socket), bypassing the syslog(3) function in libc completely. If the datagrams are correctly formatted, syslogd(8) can use them as if the messages were sent through syslog(3).
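To show what "correctly formatted" amounts to, here is a sketch that builds such a datagram by hand. The layout (a PRI number in angle brackets, a timestamp, a tag, then the message) is roughly what glibc's syslog(3) writes to /dev/log, as far as I know; messages sent over the network in the classic RFC 3164 format also carry a hostname after the timestamp. The function name format_syslog_datagram is made up for this example.

```shell
#!/bin/bash
# format_syslog_datagram is a hypothetical helper. It takes a numeric PRI
# (facility * 8 + severity), a tag, and the message text, and prints one
# datagram without a trailing newline.
format_syslog_datagram() {
    local pri=$1 tag=$2
    shift 2
    # LC_ALL=C keeps the month abbreviation in English, as receivers expect.
    printf '<%d>%s %s: %s' "$pri" "$(LC_ALL=C date '+%b %e %H:%M:%S')" "$tag" "$*"
}

# PRI 13 is user.notice (1 * 8 + 5).
format_syslog_datagram 13 demo "hello from a hand-made datagram"
echo
```

The resulting string is what would be handed to whatever tool or code writes the datagram to the socket; the sketch stops short of sending anything.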

Of course, the above covers only the "classic" logging theory. Other daemons (such as rsyslog and syslog-ng, as you mention) can replace the plain syslogd(8), and do all sorts of nifty things, like send messages to remote hosts via encrypted TCP connections, provide high resolution timestamps, and so on. And there's also systemd, that is slowly phagocytosing the UNIX part of Linux. systemd has its own logging mechanisms, but that story would have to be told by somebody else. :)

Differences with the *BSD world:

On *BSD there is no klogd(8), and /proc either doesn't exist (on OpenBSD) or is mostly obsolete (on FreeBSD and NetBSD). syslogd(8) reads kernel messages from the character device /dev/klog, and dmesg(1) uses /dev/kmem to decode kernel names. Only OpenBSD has a /dev/log. FreeBSD uses two UNIX domain sockets, /var/run/log and /var/run/logpriv, instead, and NetBSD has a /var/run/log.