How to Back Up a Remote Ubuntu VPS via SSH

backup, bash, remote, scp, ssh

So, as I said, I have a VPS, a Raspberry Pi, and a plan, but I am in need of some advice, so here goes.

I have a VPS with prgmr.com which I have up and running nice and smoothly without any issues. I also have a new Raspberry Pi sitting on my desk, with Raspbian downloading as I type. My plan is to use the Raspberry Pi as a backup server at home, so my question is this:

How would you go about backing up a remote Ubuntu VPS via SSH?

The VPS is set up as a web server, but I would like to set up a cron job on the Pi so that it can automatically log in to the VPS, run a backup, and then download it, just in case I manage to brick the thing or something else happens that results in me losing data. Naturally I want this backup to contain everything, so that if things go bad I can quickly restore the whole server, nicely configured as it is now.

How would you go about running this sort of backup? I figured that I would have to write some kind of bash script to SSH in, compress all the relevant files into a tar.gz or similar, and then download the archive via SCP.

What are your thoughts on this? What packages etc. would you use, and how would you configure it? The VPS has a LAMP stack on it, so what files would you aim to back up? It also has lots of other smaller programs installed, such as Git and ZendTools.

Best Answer

SSH Public Key Authentication

The first thing you want to do is set up SSH public key authentication. This will let your script use SSH without a password.

All that the server needs is SSH installed, and public key authentication set up for the user that will be running the backup script from the RasPi.

Here's a good tutorial for public key authentication: https://hkn.eecs.berkeley.edu/~dhsu/ssh_public_key_howto.html
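In rough terms, the setup on the Pi looks like the following sketch (the hostname and the root user are just placeholders; adapt them to your own VPS and backup user):

# On the Raspberry Pi: generate a key pair with no passphrase,
# so cron can run the backup script unattended
ssh-keygen -t rsa -f ~/.ssh/id_rsa -N ""

# Copy the public key to the account on the VPS that will run the backups
ssh-copy-id root@remoteserver.example.com

# Test: this should log in and run the command without prompting for a password
ssh root@remoteserver.example.com "echo key auth works"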

Option 1: SSH and Tar

You can have the server create the tar.gz and stream it directly over SSH with something like this:

ssh root@remoteserver.example.com "tar -czvf - / 2> /var/log/sshbackup" > vpsbackup.tar.gz

This will make the VPS tar and gzip everything under / and send it over SSH to be stored as vpsbackup.tar.gz on the RasPi. A log of the most recent backup will be kept in /var/log/sshbackup on the VPS.
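If you go this route, a minimal wrapper script on the Pi might look something like this (the paths, hostname, and schedule are only examples to adapt):

#!/bin/bash
# Sketch of a nightly tar-over-SSH backup, run from the Raspberry Pi.
# Assumes public key authentication for root@remoteserver.example.com is already set up.
DEST=/home/pi/backups
DATE=$(date +%Y-%m-%d)
mkdir -p "$DEST"
ssh root@remoteserver.example.com "tar -czvf - / 2> /var/log/sshbackup" > "$DEST/vpsbackup-$DATE.tar.gz"

# Example crontab entry on the Pi (edit with crontab -e) to run it every night at 03:00:
# 0 3 * * * /home/pi/bin/vps-tar-backup.sh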

Option 2: Rsync

Sending an entire .tar.gz over SSH is inefficient: files that don't change will still be transmitted. A better solution is to use rsync, although this makes it harder to produce a .tar.gz that preserves permissions. If you have enough storage space on the RasPi, you can just store the backup files as plain ol' files. Then you can have a script tar.gz them if you want to keep multiple past backups (there's a rough sketch of that at the end of this section).

The server needs rsync installed. This runs over SSH, so you still get public key authentication and encryption. To preserve permissions you will need to run the command as root and have public key authentication and SSH logins for root enabled on the VPS. Your destination (or at least a temporary destination) should be a Linux filesystem. If you're storing these backups on a FAT or NTFS partition (e.g. on most external hard drives), you can make a loopback filesystem (see http://www.walkernews.net/2007/07/01/create-linux-loopback-file-system-on-disk-file/) for temporary storage. The tar.gz file can be stored on any partition, because it preserves permissions on its own.
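Creating such a loopback filesystem is roughly as follows (the size, image path, and mount point are only examples):

# Create a 10 GB image file, format it as ext4, and mount it
dd if=/dev/zero of=/media/usbdrive/backup.img bs=1M count=10240
mkfs.ext4 -F /media/usbdrive/backup.img
mkdir -p /mnt/vpsbackup
mount -o loop /media/usbdrive/backup.img /mnt/vpsbackup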

An example rsync command:

rsync -a --delete --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/tmp remoteserver.example.com:/ /path/to/backup/destination/

Be careful when using --delete, especially as root! It will delete any files in the destination directory that do not exist on the backup source. You should only use --delete when syncing to a dedicated backup directory used solely for that VPS. You should also make sure there is no possibility of your script syncing to the wrong destination (e.g. if /path/to/backup/destination is determined by a shell variable).

rsync will only transfer files that are different between the source and destination, and for large files it will only transfer the blocks that have changed (its delta-transfer algorithm). This means you use minimal bandwidth, at the cost of some CPU on both sides to compute checksums and work out which blocks to send. Adding the -c flag makes rsync compare full-file checksums, rather than size and modification time, to decide which files have changed; that is more thorough but slows down re-sync preparation, since both sides must first checksum every file. If you have large files (such as database files) and/or a flaky connection, consider adding --partial --append, which enables you to resume transfers after a connection is interrupted.
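Putting it together, a rough wrapper for the Pi might look like the sketch below; the paths, hostname, and 30-day retention are assumptions to adapt, and the set -u line is there to guard the --delete destination against unset variables:

#!/bin/bash
# Sketch: mirror the VPS with rsync, then keep dated tar.gz archives of the mirror.
set -eu                      # abort on errors and on unset variables (guards --delete)
MIRROR=/mnt/vpsbackup/mirror
ARCHIVES=/mnt/vpsbackup/archives
mkdir -p "$MIRROR" "$ARCHIVES"

rsync -a --delete \
    --exclude=/dev --exclude=/sys --exclude=/proc --exclude=/tmp \
    root@remoteserver.example.com:/ "$MIRROR/"

# Keep a dated tar.gz so several past backups can be restored
tar -czf "$ARCHIVES/vpsbackup-$(date +%Y-%m-%d).tar.gz" -C "$MIRROR" .

# Delete archives older than 30 days
find "$ARCHIVES" -name 'vpsbackup-*.tar.gz' -mtime +30 -delete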
