rsync is just a brute force byte-by-byte copy of data from place to another. You would have perform multiple rsyncs until the last rsync, which is very quick. Then, you would do a full shutdown of MySQL and perform one more rsync.
XtraBackup is a comprehensive tool that behaves like rsync but has directed purpose in life. It can start off copying all InnoDB data and tablespaces. It has the ability to create checkpoints internally and perform in-place an InnoDB crash recovery to help get perfect point-in-time backup. XtraBackup also has an extra feature which allows for the creation of incremental backups. Another additional feature is the creation of InnoDB log files, also built by an in-place crash recovery. There is also some wrapper software to provide frozen copying of MyISAM tables as well.
Both methods are great. XtraBackup simply implements InnoDB transactional features into most of the initial copying. Xtrabackup creates files that can dropped into an established MySQL infrastructure. Fair to say, XtraBackup provides instrumentational backup and materialized, very usable InnoDB files.
Using rsync forces you to manage the copy process multiple times and cap it off with a mysql shutdown to require one intervention of rsync.
Using one over the other may be a personal preference. One thing must be acknowledged: XtraBackup makes backups that are somewhat bigger that the data that was copied. I leave the choice of method to the reader of this question.
I prefer rsync because it has simplicity of use, I can determine the specific point in time before the process begins, I also have total control as to locking mysqld or shutting it down, when such control can be performed, and in whatever order I designate.
Both backups styles have one thing in common : With XtraBackup, the actual point-in-time is a moving target until the backup process is complete and you have to trust XtraBackup (hey 200,000 downloads thus far. Facebook is one of the biggest users of it. It has earned a lot of trust). In other words, if I start XtraBackup at midnight and the backup lasts till 2:20 AM, then the backup's actual point-in-time is 2:20 AM. Using rsync has the same moving target problem because you have to do multiple rsyncs manually and then determine when to issue a 'service mysql stop' prior to the final rsync.
These methods differ in that mysqld must be shutdown using the rsync method, whereas XtraBackup acts independently of mysqld. In fact, XtraBackup does not interact with mysqld very much at all, except when dealing with MyISAM tables via the wrapper software. XtraBackup tries to build fully functional InnoDB data and log files using InnoDB storage engine operations separate and distinct from mysqld.
Shutting down mysql is very necessary for the rsync method for a very big reason: The InnoDB Buffer Pool can hold "dirty pages", which are data and index pages (1 Page = 16KB) that have not been written back to disk. Using rsync does not catch the data changes pending in the InnoDB Buffer Pool. To speed up the shutdown of mysqld for the final rsync, please run this command:
SET GLOBAL innodb_max_dirty_pages_pct = 0;
This command will keep the number of dirty pages to a bare minimum. This command should be run about one hour before you perform the rsync method of backup. This might even help XtraBackup as well though XtraBackup does not require a mysql shutdown.
What replication method are you using? Option 2 is fine with ROW based, but for STATEMENT replication your auto inc values on the slave are not guaranteed to be the same.
Option 2 also means that your auto inc values probably won't be consecutive, but maybe that does not matter to you.
Option 1 is the default because it is the safest method with STATEMENT based replication (which I think is the default). It is more scalable than option 0 since it only does a table level auto inc lock for bulk inserts but it still provides consecutive auto inc values with no gaps. Some folks rely upon that behavior.
Sounds like for your use case you're fine with the setting you have selected, as long as your replication settings are OK.
Hope that helps.
Best Answer
Pattern 1:
Pattern 2:
Pattern 1 locks the row(s) longer, but allows more flexibility in what is done.
Pattern 2 has two quick locks. But, because of other threads potentially messing with the same row(s), you have to check that you really got the row.
Since you mentioned queuing, think of it this way... You have two structures: (1) the queue, and (2) your application data. In general it is better to handle them separately. The way to do that is to avoid having both in the same transaction.
Having both in the same transaction causes locks to last longer for both. And it forces rollback of both, even if that is inappropriate.
Pattern 1 is better when the
SELECT
and theUPDATE
are both working on application stuff.Something like Pattern 2 is useful with you need to hang onto something "too long". (That is, more than a few seconds. Or from one html page to another, where you have lost the connection.)