Launchd notification on abnormal exit

launchd, notifications, schedule

I have a launchd job plist that runs a simple shell script that invokes rdiff-backup to back up a remote directory over SSH to my computer. The job runs every hour and it works well.

Except the other day there was a power failure* and the rdiff-backup job was interrupted. The next time launchd ran the script, rdiff-backup failed and logged its failure to the path specified in the plist. launchd, noticing the abnormal exit code, stopped trying to run the script.

And I had no idea for six days.

Obviously I don't want a notification of the exit code every time the script finishes. What can I do to be notified only of abnormal exits?

(*) It turns out my UPS battery was passing the self-test when invoked, but couldn't actually power even a minimal load for more than 3 seconds.

Best Answer

The traditional approach, e.g. with cron jobs, is to pipe standard error to a program like mail that's smart enough not to send you empty mail. The difference with launchd, as you've discovered, is that you redirect standard error by giving a StandardErrorPath for it to be written to, which isn't as convenient for this purpose as ending your crontab entry with | mail ….

My usual solution is to have a wrapper script check the StandardErrorPath and notify me if there's a problem. This can either be part of the same launchd job, so the checking happens before the next scheduled run, or you can have a separate job that just manages the error logs (maybe using a QueueDirectories key).
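
As a rough illustration of the wrapper idea, here is a minimal sketch, not a tested recipe. It is a variation on the above: rather than inspecting the StandardErrorPath file afterwards, it captures the command's standard error itself and sends it on only when the exit status is non-zero. The names run_and_notify, notify and NOTIFY_TO are made up for this example, and the use of mail assumes local mail delivery works on your machine; swap in whatever notifier you prefer.

```shell
#!/bin/sh
# Sketch of a notify-on-failure wrapper. All names here are hypothetical.

# Assumption: local mail delivery works on this machine. Replace the body
# with any notifier that reads the message from standard input.
notify() {
    mail -s "$1" "${NOTIFY_TO:-$(id -un)}"
}

# Run the given command; pass its stderr to notify only on a non-zero exit.
run_and_notify() {
    err_log=$(mktemp) || return 1
    "$@" 2>"$err_log"
    status=$?
    # Replay stderr so a StandardErrorPath log still receives it.
    cat "$err_log" >&2
    if [ "$status" -ne 0 ]; then
        notify "Job failed with status $status: $*" <"$err_log"
    fi
    rm -f "$err_log"
    return "$status"
}

# Only act when a command is given, e.g.: wrapper.sh /path/to/backup.sh
if [ "$#" -gt 0 ]; then
    run_and_notify "$@"
fi
```

The job would then run the wrapper with the real backup script as its argument, so a successful run stays silent and a failed one produces one message containing whatever the script wrote to standard error.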

I think you could also, for example, use a named pipe as your job's StandardErrorPath, but I've never actually tried that.