Systemd – Correct Way for Script-Based Service to Log Messages with Priorities

linux logs shell-script systemd systemd-journald

I have a shell script that I run as a systemd service, and I want to log messages with granular priorities into the systemd log for that service.

When I use logger(1), journald only logs some of the messages and discards the rest. Which messages end up in the service log appears to be completely random: sometimes only one or two messages get logged, sometimes none at all.

At first I thought it was a boot order/dependency issue, but that doesn't seem to be the case, because all of the messages do appear in the system log (i.e. journalctl --system), just not in the service log (i.e. journalctl -u SERVICE.service). I also tried systemd-cat, but its behavior was unfortunately similar.
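
For reference, my setup looks roughly like this (the unit and script names here are placeholders, and the script is reduced to just the logging calls):

    # /etc/systemd/system/myscript.service (placeholder name)
    [Unit]
    Description=Example script-based service

    [Service]
    Type=simple
    ExecStart=/usr/local/bin/myscript.sh

    [Install]
    WantedBy=multi-user.target

and the script it runs:

    #!/bin/sh
    # /usr/local/bin/myscript.sh (placeholder name)
    # Attempt to log with granular priorities via logger(1).
    logger -t myscript -p daemon.info    "service starting"
    logger -t myscript -p daemon.warning "something looks odd"
    logger -t myscript -p daemon.err     "something went wrong"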

What is the correct way for a script-based service to log messages with priorities to its own systemd log?

Best Answer

The problem that systemd has with logger is the same problem that it has with its own systemd-notify tool. The protocol is datagram-based and asynchronous, and the tools do one job. The invoker forks a process to run the tool; the tool fires off a datagram and forgets about it; and the process exits.

systemd server processes for the logging and readiness notification protocols want to know what service the sender belongs to, the former in order to add the right service name field to the log entry and the latter in order to know what service is being talked about. They get the process ID of the datagram sender from Linux, and then they go to the process table to look up what control groups the process belongs to, and hence what service it belongs to.
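
The same lookup can be done by hand. Given a sender's process ID, the owning control group, and hence the unit, is visible under /proc, and systemd can map a PID back to its unit directly (the PID below is a placeholder; exact cgroup paths vary between systems):

    # From inside a running service, show the control group of the
    # current shell; for a service it ends in .../SERVICE.service:
    cat /proc/$$/cgroup

    # Ask systemd which unit an arbitrary process belongs to:
    systemctl status 1234      # 1234 is a placeholder PID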

If the sending process has done its job and exited by the time that lookup happens, this does not work. The process is no longer in the process table. It is a race between the server's lookup and the client's exit, which is why some messages end up attributed to the service and others do not. systemd-notify notifications fail; logger information does not get marked as belonging to the relevant service. Switching to a stream protocol (e.g. with logger's --tcp option) would not fix this unless the logging protocol itself were also altered so that the client awaited a response from the server before closing the stream and exiting, which it does not do. RFC 5426 has no server acknowledgements sent back to clients.
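
The race is easy to demonstrate with a throwaway transient unit (run as root; the unit name here is just an example, and systemd-run --wait needs a reasonably recent systemd):

    # Run logger once as a short-lived transient service:
    systemd-run --unit=race-demo --wait logger -t race-demo "hello"

    # The message is always in the journal, findable by its tag:
    journalctl -t race-demo

    # ... but filtering by the unit may or may not include it,
    # depending on whether journald's lookup beat the process's exit:
    journalctl -u race-demo.service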

So although the log information is in the journal, it isn't tagged with the service name and isn't pulled out when you query by service name. (These aren't separate logs as you think, by the way. journalctl is simply applying filters to one big log. -u is a filter.)
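
Put differently, these are (roughly) the same query written two ways, and the "missing" messages are still in the journal, findable by a field that the client does control, such as the syslog identifier (mytag is a placeholder):

    # -u is shorthand for a match on the unit-name field:
    journalctl -u SERVICE.service
    journalctl _SYSTEMD_UNIT=SERVICE.service

    # Entries that were not attributed to the unit can still be found
    # by the tag that logger -t sets:
    journalctl -t mytag
    journalctl SYSLOG_IDENTIFIER=mytag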

This is a long-standing, widely known bug.

The systemd people describe this as a deficiency of Linux. It doesn't have proper job objects that can be used to encapsulate and track sets of processes; nor does its AF_LOCAL datagram socket mechanism transport such information. If it did, systemd could put all service processes into a job, and its logging and readiness notification servers could pull out the client-end job information whenever they receive a datagram, even if the client process had since exited.

There is an idiosyncratic protocol specific to systemd-journald that some versions of logger even speak. It does not help either: _SYSTEMD_UNIT is a "trusted field", set at the server end, and attempts by clients to set it are ignored; and the native protocol is likewise datagram-based and asynchronous, without acknowledgements. It has exactly the same problem.
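
For illustration, this is roughly what that native mode looks like in the logger versions that have it; even here the submission is a fire-and-forget datagram, and a client-supplied _SYSTEMD_UNIT= line would simply be ignored by the server:

    # util-linux logger with journald support reads journal fields
    # from standard input in its --journald mode:
    logger --journald <<EOF
    MESSAGE=hello from a script
    PRIORITY=3
    SYSLOG_IDENTIFIER=myscript
    EOF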

To reliably have log entries tagged with the right service, write to standard error. That stream is set up by the service manager, stays open for the lifetime of the service, and is therefore reliably connectable to the service unit name at the server end. Yes, that means you cannot specify the old syslog facilities and priorities; that is the trade-off that you have to make.
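
A minimal sketch of that approach, with placeholder names; the service's standard streams are connected to the journal by the service manager, so the lines are attributed to the unit:

    #!/bin/sh
    # /usr/local/bin/myscript.sh (placeholder name)
    # Anything written to standard error lands in the journal and is
    # attributed to this service's unit.
    echo "service starting" >&2
    echo "something went wrong" >&2

On current systemd the journal is typically the default destination for a service's output anyway, but it can be made explicit in the unit:

    [Service]
    StandardOutput=journal
    StandardError=journal

After that, journalctl -u SERVICE.service reliably shows the messages.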
