I am trying to script a very simple backup strategy. Here is the general idea.
Daily – Backup the entire filesystem using rsync, overwriting the previous day's backup.
Weekly – Once a week copy the daily backup to a separate folder to keep around for a week, overwriting the previous week's backup.
Monthly – On the first of the month, copy the daily backup to a monthly backup folder to keep around for a month, overwriting last month's backup.
Here is the conundrum:
On the day I do the weekly backup, the weekly and daily backups will be the same, so I won't have a backup that is a few days old.
If that day also falls on the first of the month, all three backups will be the same, defeating the whole point of having multiple backups.
I am limited on space and three backups is all I have room for. I am backing up VMs and websites so I don't need long term, but I do want backups that go back a while in case an error goes unnoticed for a few days.
Does anyone have ideas for reworking this strategy so that I don't have periods where all the backups are the same?
Best Answer
I would write a script that checks whether a backup is more than 1, 7, or 30 days old and acts accordingly. You have not said so, but I assume you are using Linux (I added the linux tag to your question) and that you are backing up to a remote server. The first step is to write a small script that runs your rsync command and also creates a file on the remote server when the backup is finished. This file will be used both to tell whether a backup is currently running and to check the backup's age (I assume you are keeping the original timestamps when you back up files, so you can't get the date from the files themselves).
Rsync script (this assumes you have password-less access to the remote server):
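A minimal sketch of such a wrapper could look like the following. The host name, the paths, the marker-file names (daily.lock and daily.done) and the function name run_backup are all placeholders I picked for illustration, so adjust them to your setup. The lock file is created before rsync starts and removed only after it succeeds, so a failed run leaves the lock in place and the rotation script will not touch a partial backup.

```shell
#!/bin/sh
# Sketch of a daily backup wrapper. Every value below is an assumed
# placeholder, not something taken from the question.
SRC="${SRC:-/}"                                # what to back up
REMOTE="${REMOTE:-backup@backup.example.com}"  # hypothetical backup host
DEST="${DEST:-/backups/daily}"                 # hypothetical remote directory

run_backup() {
    # Mark the backup as running so the rotation script leaves it alone
    ssh "$REMOTE" "touch '$DEST.lock'" || return 1

    # -a keeps timestamps and permissions; --delete mirrors removed files.
    # Pseudo-filesystems are excluded because they cannot be restored.
    rsync -a --delete \
        --exclude=/proc --exclude=/sys --exclude=/dev \
        --exclude=/run --exclude=/tmp \
        "$SRC" "$REMOTE:$DEST/" || return 1

    # Record the completion time and release the lock
    ssh "$REMOTE" "touch '$DEST.done' && rm -f '$DEST.lock'"
}
```

To use it from cron, save it under a name of your choice and add a call to run_backup as the last line.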
On the local machine, set up a cron job that does daily backups:
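As a rough example, assuming the wrapper was saved as /usr/local/bin/daily_backup.sh (a hypothetical path) and that 02:30 is a quiet time on your machine, the entry could be installed like this. The helper name install_daily_cron is my own, and you can just as well paste the line into crontab -e by hand.

```shell
#!/bin/sh
# Crontab fields: minute hour day-of-month month day-of-week command.
# The script path is an assumed placeholder.
CRON_LINE='30 2 * * * /usr/local/bin/daily_backup.sh'

install_daily_cron() {
    # Keep the existing entries, drop any earlier copy of this line,
    # append it once, and load the result as the new crontab
    { crontab -l 2>/dev/null | grep -vF "$CRON_LINE"; echo "$CRON_LINE"; } | crontab -
}
```

Running install_daily_cron twice leaves only one copy of the entry, because the old copy is filtered out before the line is appended.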
On the remote machine, you need to run the script I give below every few hours:
The check_backup.sh script:
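What follows is a sketch of what such a rotation script could look like, not a definitive version. It assumes GNU coreutils (stat -c), a hypothetical /backups directory holding daily, weekly and monthly, and the marker files daily.lock and daily.done written by the backup job. The function names and the thresholds of 7 and 30 days are illustrative and easy to change.

```shell
#!/bin/sh
# Sketch of check_backup.sh. Paths and marker names are assumptions.
BACKUP_ROOT="${BACKUP_ROOT:-/backups}"

# Print the age of a timestamp file in whole days.
# A missing file counts as very old, so the first copy always happens.
age_days() {
    [ -e "$1" ] || { echo 99999; return; }
    now=$(date +%s)
    mtime=$(stat -c %Y "$1")
    echo $(( (now - mtime) / 86400 ))
}

# rotate NAME DAYS: replace NAME (weekly or monthly) with a copy of daily
# if NAME is at least DAYS days old and differs from the current daily.
rotate() {
    target="$BACKUP_ROOT/$1"
    [ "$(age_days "$target.done")" -ge "$2" ] || return 0

    # diff -rq exits 0 when both trees are identical: nothing new to keep
    if [ -d "$target" ] && diff -rq "$BACKUP_ROOT/daily" "$target" >/dev/null 2>&1; then
        return 0
    fi

    rm -rf "$target"
    cp -a "$BACKUP_ROOT/daily" "$target"
    # -p keeps the daily timestamp, so the copy's age is the age of its data
    cp -p "$BACKUP_ROOT/daily.done" "$target.done"
}

check_backups() {
    # Do nothing while a backup is running or before the first one finishes
    [ -e "$BACKUP_ROOT/daily.lock" ] && return 0
    [ -e "$BACKUP_ROOT/daily.done" ] || return 0
    rotate monthly 30
    rotate weekly 7
}

check_backups
```

A crontab entry on the backup server such as 0 * * * * /usr/local/bin/check_backup.sh (again a hypothetical path) would run it hourly. Because each copy carries its own timestamp file, weekly and monthly age independently and are not tied to calendar dates.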
So, this script (check_backup.sh) will be run every hour on your backup server. Since it does nothing unless a backup is old enough, it's no problem to have it run so often. Whenever the monthly backup is older than 31 days, the contents of the monthly directory will be deleted and replaced with a copy of the current daily backup. The same happens for weekly when that backup is more than 7 days old.
I am using diff to compare the backups. This means that daily will be copied to weekly if the current weekly is more than a week old, but only if the backup that would be copied (the current daily) is not the same as the existing weekly, and similarly for monthly. For example, if the script has just run and found that the monthly backup is the same as the current weekly one, it will not overwrite the existing monthly. However, one week later, when weekly has changed, it will go ahead with the monthly copy.
The net result is that at any time you should have a minimum of two different backups, and usually you will have three. The worst-case scenario is that something fails and you don't have a week-old backup, just a month-old one or, vice versa, you don't have a month-old one but you do have last week's.