Rsyslog filling up /var/log puts the system down

disk-usagelogsrsyslogsyslog

Sometimes a process goes wrong and the logs in /var/log grow up so much they ultimately fill up the whole partition.

It happened me once on a server due to a wrong postfix config, and once on a desktop due to a USB printer (not sure exactly what went wrong, all I know is the logs where filled with (hp) did not claim interface 1 before use).

I know the root cause is not the logger but the application. However, I can't help thinking this weak point is a shame. Especially on a desktop, where a printer filling up the whole system partition prevents the user from loading the GUI on the next run (no space on /tmp) which is a total blocker for a non-techy.

  • logrotate is not an answer because it acts daily or even weekly.

  • rsyslog has that max-size,action-on-max-size config option, but it seems non trivial and according to the doc itself, it might break in a future release.

  • Putting /var/log in a dedicated partition, however, would prevent this from happening.

To my knowledge, the separate partition for /var/log is the only solution to this. I've seen this recommended sometimes, but it is not the default in Debian installer, for instance. Should it be?

Is there any other simple way to avoid this? Some way to provide a max size to the /var/log directory, or at least to rsyslog?

Is this issue not frequent enough to justify a protection mechanism that would be activated by default? (I'm particularly thinking of home/desktop installation, for which users are not supposed to be able to deal with this themselves.)

Edit: SystemLogRateLimit

Thanks to Julie Pelletier's answer, I discovered the rate limit mechanism in rsyslog which answers precisely this need and should be activated by default.

There is input rate limiting available, (since 5.7.1) to guard you against the problems of a wild running logging process. If more than SysSock.RateLimit.Interval * SysSock.RateLimit.Burst log messages are emitted from the same process, those messages with SysSock.RateLimit.Severity or lower will be dropped.

However, SystemLogRateLimit does not work if the PID changes. The doc page explains how to test the rate limiting feature and says that

rate limiting only works if the same PID is given

I guess this is why it does not apply here.

On my desktop:

Jul 25 21:34:36 bouzin kernel: [46038.140491] usb 1-5: usbfs: process 12126 (hp) did not claim interface 1 before use
Jul 25 21:34:36 bouzin kernel: [46038.140546] usb 1-5: usbfs: process 12127 (hp) did not claim interface 1 before use
Jul 25 21:34:36 bouzin kernel: [46038.140606] usb 1-5: usbfs: process 12128 (hp) did not claim interface 1 before use
Jul 25 21:34:36 bouzin kernel: [46038.140675] usb 1-5: usbfs: process 12129 (hp) did not claim interface 1 before use
Jul 25 21:34:36 bouzin kernel: [46038.140740] usb 1-5: usbfs: process 12130 (hp) did not claim interface 1 before use
Jul 25 21:34:36 bouzin kernel: [46038.140809] usb 1-5: usbfs: process 12131 (hp) did not claim interface 1 before use
Jul 25 21:34:36 bouzin kernel: [46038.140868] usb 1-5: usbfs: process 12132 (hp) did not claim interface 1 before use
Jul 25 21:34:36 bouzin kernel: [46038.140928] usb 1-5: usbfs: process 12133 (hp) did not claim interface 1 before use
Jul 25 21:34:36 bouzin kernel: [46038.140988] usb 1-5: usbfs: process 12134 (hp) did not claim interface 1 before use
Jul 25 21:34:36 bouzin kernel: [46038.141046] usb 1-5: usbfs: process 12135 (hp) did not claim interface 1 before use

On my server:

Jul  5 13:37:45 server postfix/smtpd[426]: NOQUEUE: reject_warning: RCPT from unknown[177.11.51.178]: 451 4.3.0 <some_adress@yahoo.com.br>: Temporary lookup failure; from=<address@gmail.com> to=<some_adress@yahoo.com.br> proto=ESMTP helo=<mail.SERVER.TEST>
Jul  5 13:37:45 server postfix/smtpd[437]: NOQUEUE: reject_warning: RCPT from unknown[177.11.51.178]: 451 4.3.0 <some_adress@yahoo.com.br>: Temporary lookup failure; from=<address@gmail.com> to=<some_adress@yahoo.com.br> proto=ESMTP helo=<mail.SERVER.TEST>
Jul  5 13:37:45 server postfix/smtpd[426]: NOQUEUE: reject_warning: RCPT from unknown[177.11.51.178]: 451 4.3.0 <some_adress@yahoo.com.br>: Temporary lookup failure; from=<address@gmail.com> to=<some_adress@yahoo.com.br> proto=ESMTP helo=<mail.SERVER.TEST>
Jul  5 13:37:45 server postfix/smtpd[437]: NOQUEUE: reject_warning: RCPT from unknown[177.11.51.178]: 451 4.3.0 <some_adress@yahoo.com.br>: Temporary lookup failure; from=<address@gmail.com> to=<some_adress@yahoo.com.br> proto=ESMTP helo=<mail.SERVER.TEST>

So IIUC rsyslog's rate limiting feature is not relevant here because each log line is written by a different process.

Edit 2: Restrict directory size

I ended up limiting the size of /var/log using a virtual filesystem (as suggested here).

I might set up a separate partition next time I install a Linux.

I'm keeping this question open for now as I consider this a workaround rather than an answer.

Best Answer

rsyslog includes a rate limiting option by default through the imuxsock module.

It defaults to 200 messages per 5 seconds but can easily be changed by setting the following after the module is loaded:

$SystemLogRateLimitInterval 5
$SystemLogRateLimitBurst 200

The $SystemLogRateLimitInterval is the interval in seconds (which you should increase) and $SystemLogRateLimitBurst is the maximum number of messages allowed by the application during that interval (which you should decrease).

Update: Based on your update regarding the fact that errors are flooding your syslog with different process IDs, there is no practical way for the daemon to deal with these efficiently.

Changing the log rotation rules on maximum file size would therefore be the only possible solution. Note that once compressed (as per usual log rotation process), these large files will become minuscule because of the repetitivity of their contents.

Related Question