I have a launchd job plist that runs a simple shell script that invokes rdiff-backup to back up a remote directory over SSH to my computer. The job runs every hour and it works well.
Except the other day there was a power failure* and the rdiff-backup job was interrupted. The next time launchd ran the script, rdiff-backup failed and logged its failure to the path specified in the plist. launchd, noticing the abnormal exit code, stopped trying to run the script.
And I had no idea for six days.
Obviously I don't want a notification of the exit code every time the script finishes. What can I do to be notified only of abnormal exits?
(*) turns out my UPS battery was passing the self-test when invoked, but didn't actually have the ability to power even a minimal load for more than 3 seconds.
Best Answer
The traditional approach, e.g. with cron jobs, is to pipe standard error to a program like mail that's smart enough not to send you empty mail. The difference with launchd, as you've discovered, is that the mechanism for redirecting standard error is giving a StandardErrorPath for it to be written to, which isn't as convenient for this purpose as ending your crontab entry with | mail ….
My usual solution is to have a wrapper script check the StandardErrorPath and notify me if there's a problem. This can either be part of the same launchd job, so the checking happens before the next scheduled run, or you can have a separate job that just manages the error logs (maybe using a QueueDirectories key).
I think you could also, for example, use a named pipe as your job's StandardErrorPath, but I've never actually tried that.
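For what it's worth, here is a rough sketch of the wrapper-script idea. It is a variation rather than exactly what I described: instead of reading the StandardErrorPath file afterwards, the wrapper captures standard error itself and notifies only when the command exits with a non-zero status. The names run_and_notify and notify, the log path, and the mail address are all placeholders I made up for illustration, and notify just prints to standard output until you swap in something that actually reaches you.

```shell
#!/bin/sh
# Sketch of a wrapper for a launchd job: run a command, capture its stderr,
# and send a notification only when the command exits with a non-zero status.

# Hypothetical notifier: replace the body with whatever reaches you, e.g.
#   mail -s "$1" you@example.com < "$2"
# As written it only prints the subject and the captured stderr to stdout.
notify() {
    printf 'ALERT: %s\n' "$1"
    cat "$2"
}

# run_and_notify ERR_LOG COMMAND [ARGS...]
# Runs COMMAND, appends its stderr to ERR_LOG, and calls notify on failure.
run_and_notify() {
    err_log=$1
    shift
    err_tmp=$(mktemp) || return 1

    # Run the command; its stdout passes through, its stderr is captured.
    "$@" 2>"$err_tmp"
    status=$?

    # Keep a persistent copy of stderr, much as StandardErrorPath would.
    cat "$err_tmp" >>"$err_log"

    # Only an abnormal exit triggers a notification.
    if [ "$status" -ne 0 ]; then
        notify "$1 exited with status $status" "$err_tmp"
    fi

    rm -f "$err_tmp"
    return "$status"
}

# Example invocation (placeholder paths, adjust to your own job):
# run_and_notify "$HOME/Library/Logs/backup.err" /path/to/backup-script.sh
```

The launchd job would then run this wrapper instead of the backup script directly. A clean run stays silent, and a failed run produces one message containing whatever the command wrote to standard error.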