How to back up an encrypted Core Storage volume off-site

backup, core-storage, encryption

Given a hard disk with an encrypted Core Storage volume (but not the decryption password, because the backup service should not have access to that), how would one go about backing it up in a way that allows for pushing it to a cloud storage provider (like Amazon S3) and for incremental backups in the future (because you don't want to push a full 1 TB every day when only a couple of blocks have changed)?

Best Answer

Proposed solution:

You have an Amazon EC2 instance with an Elastic Block Store (EBS) volume large enough to hold the whole of the image you intend to back up:

backup-host.yourdomain.com:
/mnt/EBS/my-desktop-backup/coreimage.dmg
/mnt/EBS/my-laptop-backup/coreimage.dmg

Where:

/dev/ebs-disk-001 -> /mnt/EBS/my-desktop-backup
/dev/ebs-disk-002 -> /mnt/EBS/my-laptop-backup
etc.

or

backup-host.yourdomain.com:
/mnt/EBS/my-desktop-backup_coreimage.dmg
/mnt/EBS/my-laptop-backup_coreimage.dmg

Where:

/dev/ebs-disk-001 -> /mnt/EBS

Your initial backup will take a long while to sync, but if you employ rsync, each subsequent run only transfers the changed blocks, letting the remote image catch up with your local image's changes.
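As a rough sketch, assuming SSH access to the instance and using the example hostname and paths above (the local image path is a placeholder):

    # Update the remote copy in place so rsync only rewrites the
    # changed blocks instead of recreating the whole 1 TB file.
    rsync -av --inplace /Volumes/Backups/coreimage.dmg \
        backup-host.yourdomain.com:/mnt/EBS/my-desktop-backup/coreimage.dmg

Without --inplace, rsync would build a full temporary copy of the image on the remote side before swapping it in, which doubles the disk I/O and space needed on the EBS volume.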

Once it is caught up, you can then initiate an EBS snapshot on Amazon's side for the EBS volume containing your encrypted image.
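With the AWS CLI (the volume ID below is a placeholder), that snapshot can be triggered like so:

    # Snapshot the EBS volume holding the synced image.
    aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 \
        --description "my-desktop-backup $(date +%F)"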

Rinse and repeat for each backup period + snapshot you want pushed to the remote server, taking the following requirements into account:

  • The encrypted image needs to be unmounted (a sketch follows this list).
  • The remote image copy needs to be 100% synced with the unmounted local image.
  • The snapshot needs to be taken with the remote EBS volume synced, filesystem buffers flushed, and no changes pending.
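On the Mac side, detaching the encrypted image before the sync might look like this (the mount point is an assumption for illustration):

    # Detach the encrypted disk image so no writes land mid-sync.
    hdiutil detach /Volumes/MyEncryptedVolume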

With this, you will be able to do incremental backups using Amazon's EBS snapshot mechanism.

S3 has some serious limitations that would not suit your needs for this particular purpose.

The EC2 instance, if fully backed by EBS, can be shut down when you are not doing a remote sync. That is, when your backup kicks off, it can fire up the instance via Amazon's EC2 API and get the dynamic name or IP address. Once it confirms the instance is up, it can kick off the rsync backup. When done, it can shut down the remote instance and initiate an Amazon EBS volume snapshot.
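A minimal sketch of that cycle as a driver script, assuming a configured AWS CLI; the instance ID, volume ID, and paths are all placeholders:

    #!/bin/sh
    INSTANCE=i-0123456789abcdef0
    VOLUME=vol-0123456789abcdef0

    # Fire up the backup instance and wait until it is running.
    aws ec2 start-instances --instance-ids "$INSTANCE"
    aws ec2 wait instance-running --instance-ids "$INSTANCE"

    # Grab the dynamic public name assigned at boot.
    HOST=$(aws ec2 describe-instances --instance-ids "$INSTANCE" \
        --query 'Reservations[0].Instances[0].PublicDnsName' --output text)

    # Push only the changed blocks of the unmounted image.
    rsync -az --inplace /Volumes/Backups/coreimage.dmg \
        "$HOST":/mnt/EBS/my-desktop-backup/coreimage.dmg

    # Stop the instance, then snapshot the quiesced volume.
    aws ec2 stop-instances --instance-ids "$INSTANCE"
    aws ec2 wait instance-stopped --instance-ids "$INSTANCE"
    aws ec2 create-snapshot --volume-id "$VOLUME" \
        --description "my-desktop-backup $(date +%F)"

Stopping the instance before taking the snapshot guarantees the volume has no pending writes, which satisfies the last requirement in the list above.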

Edit:

rsync does chunk/block-level diffs for larger files. You can specify the block size used for the diff:

--block-size=SIZE

You can also have the data stream sent to the remote server compressed, saving you traffic.
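For example, forcing a larger delta block size and compressing the stream in transit with -z (the block size here is just an illustration):

    rsync -avz --inplace --block-size=131072 coreimage.dmg \
        backup-host.yourdomain.com:/mnt/EBS/my-desktop-backup/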

Caveats about S3 vs EBS:

Unless the solution you employ supports splitting a single large file into segments and sending them in parallel, Amazon S3 throttles the transfer down to under 400 KB/sec after a certain size.

I employ rsync differential backups on my servers to S3 as compressed tarballs. Even at tarball sizes of about 500 MB, S3 will throttle. To work around this, you need to split the file you are sending up into parts; otherwise, the backup to S3 will take forever.
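A minimal sketch of that workaround, with placeholder names and a part size picked for illustration:

    # Break the tarball into 100 MB pieces so several can be
    # uploaded to S3 in parallel.
    split -b 100m backup.tar.gz backup.tar.gz.part-
    # On restore: cat backup.tar.gz.part-* > backup.tar.gz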

An EC2 instance with EBS volumes, by contrast, will be faster and will not require splitting files, simplifying both backup and restoration.