macOS – How to respawn a cron-like launchd script after a script error

backup error launchd macos script

I have a cron-like launchd script (StartCalendarInterval) that does a backup of some website data once per day:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.backup</string>
    <key>ProgramArguments</key>
    <array>
        <string>/Volumes/Example/backup.sh</string>
    </array>
    <key>StartCalendarInterval</key>
    <array>
        <dict>
            <key>Hour</key>
            <integer>2</integer>
            <key>Minute</key>
            <integer>15</integer>
        </dict>
    </array>
    <key>StandardErrorPath</key>
    <string>/var/log/com.example/backup_error</string>
    <key>StandardOutPath</key>
    <string>/var/log/com.example/backup_output</string>
</dict>
</plist>

In rare cases the backup fails because the Internet connection is not available. The backup.sh script then exits with a proper error code, that is, a status greater than 0.

I would like the script to be relaunched automatically one hour after an error, again and again until it finishes without an error. The retries should stop before 24 hours have passed, so that two instances of the script never run at the same time.
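The worry about two overlapping instances can also be handled defensively inside the script, independent of how launchd schedules it. The following is only a sketch: the lock path and the function names are hypothetical, and it relies on `mkdir` either creating the directory or failing, which makes it usable as a simple lock.

```shell
#!/bin/sh
# Sketch only: LOCK_DIR is a hypothetical path, not part of the original setup.
LOCK_DIR="${LOCK_DIR:-/tmp/com.example.backup.lock}"

# mkdir either creates the directory or fails, so only one
# process at a time can succeed in taking the lock.
acquire_lock() {
    mkdir "$LOCK_DIR" 2>/dev/null
}

# Remove the lock directory again when the run is finished.
release_lock() {
    rmdir "$LOCK_DIR" 2>/dev/null
}
```

Near the top of backup.sh this could be used as `acquire_lock || exit 0` followed by `trap release_lock EXIT`, so a second instance leaves quietly while the first is still running. A script that is killed without running its trap would leave a stale lock behind, so this is a starting point rather than a complete solution.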

I believe this must be possible with ThrottleInterval and SuccessfulExit. My problem is that SuccessfulExit is linked to KeepAlive. I do not want the script to run all the time, but just once a day via the StartCalendarInterval.

Is my task doable directly with launchd? Or should I simply add "wait one hour and try again after an error" logic to my script? Set up like that, the script would occupy resources the whole time, which I would like to avoid.
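For comparison, the in-script alternative mentioned in the question could look roughly like this. It is a sketch under assumptions: `retry_until_success` is a hypothetical helper, and the command it wraps stands in for the real backup step. A shell waiting in `sleep` uses essentially no CPU, but the process does stay resident between attempts, which is the cost the question wants to avoid.

```shell
#!/bin/sh
# Sketch only: retry_until_success is a hypothetical helper.
# Usage: retry_until_success MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until_success() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0            # command succeeded, stop retrying
        fi
        echo "attempt $attempt of $max_attempts failed" >&2
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay"      # the process stays resident but idle here
        fi
        attempt=$((attempt + 1))
    done
    return 1                    # every attempt failed
}
```

A call such as `retry_until_success 23 3600 do_backup` (with `do_backup` being a placeholder for the actual work) makes 23 attempts with 22 one-hour pauses between them, so it ends before the next daily run as long as the attempts themselves are short.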

Best Answer

It seems this can be done, at least partially. The problem was that KeepAlive in combination with SuccessfulExit implied RunAtLoad: the program was launched as soon as the job was loaded, not at the time specified in StartCalendarInterval. Setting the additional parameter AfterInitialDemand (which is undocumented) changes this behaviour, so that the program is first launched at the specified calendar time:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.backup</string>
    <key>ProgramArguments</key>
    <array>
        <string>/Volumes/Example/backup.sh</string>
    </array>

    <key>StartCalendarInterval</key>
    <array>
        <dict>
            <key>Hour</key>
            <integer>2</integer>
            <key>Minute</key>
            <integer>15</integer>
        </dict>
    </array>

    <key>KeepAlive</key>
    <dict>
        <key>SuccessfulExit</key>
        <false/>
        <key>AfterInitialDemand</key>
        <true/>
    </dict>
    <key>ThrottleInterval</key>
    <integer>3600</integer>

    <key>StandardErrorPath</key>
    <string>/var/log/com.example/backup_error</string>
    <key>StandardOutPath</key>
    <string>/var/log/com.example/backup_output</string>
</dict>
</plist>

The only remaining problem is that once the script fails, ThrottleInterval overrules StartCalendarInterval. Depending on when the error occurred and on the interval that was set, a script that is still failing a day later will not launch exactly at the specified calendar time; it may keep running at odd time offsets until it succeeds.
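One way to bound this drift from inside the script is to stop reporting failures once a retry window has passed. With SuccessfulExit set to false, launchd only respawns the job after a nonzero exit, so exiting with 0 ends the retry cycle. The sketch below is an assumption-laden illustration, not something tested against launchd: `STATE_FILE`, `MAX_RETRY_WINDOW`, and `exit_status_for_launchd` are hypothetical names, and the 23-hour window is an arbitrary choice.

```shell
#!/bin/sh
# Sketch only: STATE_FILE and MAX_RETRY_WINDOW are hypothetical names.
STATE_FILE="${STATE_FILE:-/tmp/com.example.backup.first_failure}"
MAX_RETRY_WINDOW="${MAX_RETRY_WINDOW:-82800}"   # 23 hours, in seconds

# Decide which exit status to report to launchd after a run.
# $1 = status of the backup itself, $2 = current time in epoch seconds
exit_status_for_launchd() {
    backup_status=$1
    now=$2
    if [ "$backup_status" -eq 0 ]; then
        rm -f "$STATE_FILE"            # success: forget earlier failures
        return 0
    fi
    if [ ! -f "$STATE_FILE" ]; then
        echo "$now" > "$STATE_FILE"    # remember when the failures began
    fi
    first_failure=$(cat "$STATE_FILE")
    if [ $((now - first_failure)) -ge "$MAX_RETRY_WINDOW" ]; then
        rm -f "$STATE_FILE"
        return 0                       # window used up: report success so the retries stop
    fi
    return "$backup_status"            # still inside the window: keep the failure status
}
```

At the end of backup.sh this might be called as `exit_status_for_launchd "$backup_status" "$(date +%s)"; exit $?`, where `backup_status` holds the result of the actual backup step. Whether the next run then starts exactly at the calendar time still depends on how launchd applies ThrottleInterval.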

To sum up: the trick is to set the undocumented AfterInitialDemand key to true.