Mysql – Is it safe to run parallel innodb single-transaction dumps of individual tables

backupinnodbMySQLmysqldump

I can't seem to find the internals for –single-transaction mysqldump option for InnoDB tables, but is it possible to use –single-transaction for single tables in a parallel dump so that the –single-transaction only affects the current table I'm dumping and it is safe to run in parallel for quicker dumps? I tested this on a development server and works well but without knowing the internal locking methods I am worried to use it in production

Best Answer

While parallel --single-transaction dumps on tables are possible, there are many things you must take under consideration:

InnoDB Buffer Pool

When you mysqldump a single table, the connection launches a START TRANSACTION followed by SELECT /*!40001 SQL_NO_CACHE */ ..... Naturally, this will push all data pages for the dumped table through the InnoDB Buffer Pool. Launching parallel dumps would just make the Buffer Pool become a funnel because of passing multiple tables through it. The combined time of each table dump may not necessarily be be less that dumping all the table sequentially if the combined size of the table is greater than innodb_buffer_pool_size.

Binary Log Position

When doing parallel dumps with --master-data on a DB Server with binary logging enabled, you will quickly realize that the dump of each table is not perfectly consistent. Why ? When launching a mysqldump of everything, lines 23-25 will contain a CHANGE MASTER TO command before the dump data/schema. Each dump will have a different master_log_pos, especially coming from a live system.

Locking

You had made for following statement

I tested this on a development server and works well but without knowing the internal locking methods I am worried to use it in production

Performing a single transaction dump does not trigger any kind of locking. According to the MySQL Documentation on --single-transaction:

While a --single-transaction dump is in process, to ensure a valid dump file (correct table contents and binary log coordinates), no other connection should use the following statements: ALTER TABLE, CREATE TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE. A consistent read is not isolated from those statements, so use of them on a table to be dumped can cause the SELECT that is performed by mysqldump to retrieve the table contents to obtain incorrect contents or fail.

The --single-transaction option and the --lock-tables option are mutually exclusive because LOCK TABLES causes any pending transactions to be committed implicitly.

SUGGESTION

I have recommended doing parallel dumps in the past (See my old post How can I optimize a mysqldump of a large database? under the heading Option 3 : mysqldump separate tables into separate data files)

If you want to launch parallel dumps of tables in Production, it is best to setup MySQL Replication. Then on the Backup Slave, you can do the following:

  • STOP SLAVE IO_THREAD;
  • Wait until the SQL Thread stops processing (SHOW SLAVE STATUS\G shows that Exec_Master_Log_Pos stops changing)
  • Perform parallel dumps
  • START SLAVE;

GIVE IT A TRY !!!