Mysql – the proper way to backup MySQL database with rsnapshot

backupMySQLsnapshot

I want to backup my website that contains both user uploaded images and data stored in MySQL database such that they are always consistent with each other in any particular point of time. While searching for solutions, I found this application rsnapshot which might be appropriate for the task. Having gone through a few blogs on the internet, I realize that this is actually done with a backup_script using mysqldump method:

backup_script   ssh root@example.com "mysqldump -A > /var/db/dump/mysql.sql"    unused2
backup  root@example.com:/var/db/dump/  example.com/

What I am not sure is whether there is a necessity to flush all tables with read lock before performing the dump in the script to ensure consistency of the database. If so, how should this be incorporated? If no, why is it not necessary?

Best Answer

By default, your usage of mysqldump will lock each table as it dumps it and there will not be any consistency among tables. You will also not get any triggers, stored procedures, stored functions, or events backed up, and if any of your view definitions are invalid (referencing nonexistent tables or columns that you may have dropped or renamed since the view was defined), the dump process will terminate when that error is encountered.

If all of your tables are InnoDB, you can get a consistent snapshot by adding this:

--single-transaction 

This option disables the default --lock-tables which locks each table as it is dumped, then unlocks it when done dumping it, and instead issues the following statements to the server at the beginning of the dump (which you can observe by enabling the general query log):

SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION WITH CONSISTENT SNAPSHOT;

Of course, this only helps with InnoDB. If your tables are MyISAM or any other non-transactional engine, you only have one option:

--lock-all-tables

...which locks all of the tables for the entire dump, since that's the only way to ensure consistency without transactions to assist.

To back up your stored procedures, events, and triggers, add:

--routines --events --triggers

To prevent mysqldump from failing when there's an invalid view definition

--force

To capture your binlog position and store it in the backup file, add:

--master-data=2

This option doesn't work if your machine isn't a slave.


Your question was about backing up the database "with rsnapshot" although to put a fine point on it, you're proposing backing up the database with mysqldump and then backing up the dumpfile with rsnapshot... It's hard to tell from a brief look at their site whether that's going to be a good choice, because rsnapshot appears to have file-level granularity. It doesn't appear to store diffs, but instead stores full files for "each" file that changes between backup cycles:

"The amount of space required is roughly the size of one full backup, plus a copy of each additional file that is changed." -- http://www.rsnapshot.org/rsnapshot.html

I welcome correction if I'm wrong on this point. In this case, you only have 1 file, your dump file from mysqldump, which will of course change every time.

Whatever you do, don't entertain the thought of trying to back up the actual files (.frm, .ibd, .MYD, etc.) underlying your tables in MySQL. That does not work with the server running even though under rare conditions you might get a backup that seems to be intact. It isn't.


Update:

There's an important distinction between what mylvmbackup is doing and "backing up the actual files." To use mylvmbackup, you have to be using LVM, which can be visualized as a layer of abstraction between your filesystem and your hard drive. With LVM, you have the ability to freeze the file system and make a point-in-time snapshot of the entire filesystem.

It looks like mylvmbackup is doing a FLUSH TABLES followed by a FLUSH TABLES WITH READ LOCK prior to looking up the replication coordinates (if specified in configuration), which is the same process mysqldump does when --single-transaction and --master-data are both used. This stablizes MySQL but it does not fully quiesce InnoDB, so when a backup like this is restored, InnoDB will think it crashed and will do crash recovery... but it should be a clean recovery.

If your web site's data files were in the same filesystem, then an LVM snapshot would contain everything consistent to a single point in time, but whether having your other web site files in the same filesystem as the database is good practice (if, indeed running your web server on the same server is good practice) is another discussion.

Whatever the approach, it's vitally important in any backup strategy to periodically verify that your backups can actually be restored (on a different machine). The truly safest method would be to be sure innodb_fast_shutdown = 0 and actually shut down MySQL but that's probably not practical.

Of course, if you want with something like rsnapshot, you could always snapshot the web site files, then snapshot the database, then snapshot the web site files again. If nothing changes in the site files while the database is being backed up, rsnapshot will use very little space for that second run. If things do change, it should be fairly trivial to understand the differences if you needed to recover.