Linux – Creating hard drive backup images efficiently

backup, cleaning, filesystems, linux

We are in the process of pruning our directories to reclaim some disk space.

The 'algorithm' for the pruning/backup process consists of a list of directories and, for each of them, a set of rules, e.g. 'compress *.bin', 'move *.blah', 'delete *.crap', 'leave *.important'; the rules differ from directory to directory but are well known. The compressed and moved files are staged on a temporary file system, burned onto a Blu-ray disc, verified from the Blu-ray, and finally deleted from their original locations.

I am doing this in Python (basically an os.walk loop driven by a dictionary that maps each folder to the rules for each extension).
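Roughly, the idea looks like this; the paths, extensions, staging location and rule names below are just illustrative, not my actual setup:

    import os
    import gzip
    import shutil

    STAGING = "/mnt/staging"          # temporary filesystem later burned to Blu-ray (placeholder)
    RULES = {                         # per-directory, per-extension rules (placeholders)
        "/data/projectA": {".bin": "compress", ".blah": "move",
                           ".crap": "delete", ".important": "leave"},
    }

    def stage_compressed(path):
        # gzip a copy into the staging area; the original stays in place
        # until the Blu-ray copy has been verified.
        target = os.path.join(STAGING, os.path.basename(path) + ".gz")
        with open(path, "rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)

    def prune(root, rules):
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                action = rules.get(os.path.splitext(name)[1], "leave")
                full = os.path.join(dirpath, name)
                if action == "compress":
                    stage_compressed(full)
                elif action == "move":
                    # copy to staging; the original is only removed after verification
                    shutil.copy2(full, os.path.join(STAGING, name))
                elif action == "delete":
                    os.remove(full)
                # "leave": nothing to do

    for root, rules in RULES.items():
        prune(root, rules)

Originals are deleted in a separate pass, after the burned disc has been checked.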

Do you recommend a better methodology for pruning file systems? How do you do it?

We run on Linux.

Best Answer

I had a system for dropping aged backups from a partition full of backup tarballs. Each host had its own directory, and within each directory I defined a file (e.g. 00info) that my pruner would read and run a find against. The problem it ran into was backups entering a directory that didn't match any of the patterns in the file. It mostly relied on find(1), like

for pat in $patterns; do find . -type f -name "$pat" -mtime +7 | xargs rm -f; done

This was not great, but it was very simple. And I find that if it's simple to maintain, you'll have time to actually maintain it among normal everyday pressures.
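If you wanted the same thing in Python rather than shell, a rough sketch might look like the following, assuming one subdirectory per host and a 00info file holding one glob pattern per line as described above; the backup root and the seven-day cutoff are just illustrative:

    import fnmatch
    import os
    import time

    BACKUP_ROOT = "/backups"        # one subdirectory per host (placeholder path)
    MAX_AGE = 7 * 24 * 3600         # prune matching files older than seven days

    def prune_host_dir(host_dir):
        info = os.path.join(host_dir, "00info")
        if not os.path.exists(info):
            return                                   # no rules defined, leave alone
        with open(info) as fh:
            patterns = [line.strip() for line in fh if line.strip()]
        cutoff = time.time() - MAX_AGE
        for name in os.listdir(host_dir):
            full = os.path.join(host_dir, name)
            if not os.path.isfile(full) or name == "00info":
                continue
            if any(fnmatch.fnmatch(name, pat) for pat in patterns):
                if os.path.getmtime(full) < cutoff:
                    os.remove(full)

    for entry in os.listdir(BACKUP_ROOT):
        host_dir = os.path.join(BACKUP_ROOT, entry)
        if os.path.isdir(host_dir):
            prune_host_dir(host_dir)

Same caveat as the shell version: anything that doesn't match a pattern is silently left behind, so the pattern files need occasional review.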

If you're programming in Python, a bash script isn't going to compare to what you're capable of. So the important thing I'd suggest is: don't feel guilty about having something no one else uses. You've created a solution that is correct for your requirements, and you can't be more correct than that.

Is there an actual problem your script isn't solving, though? Has it become difficult to maintain the rule-set?
