Mysql – Why is restoration from binary logs so slow

binlogMySQLmysql-5.5mysqlbinlog

We have a MySQL 5.5 server with binary logging enabled and a backup script that collects the binary logs and, once a day, generates a complete database dump with mysqldump. In testing database restoration from the dump & binlogs, we found that the latter are an order of magnitude slower.

Specifically:

  • Restoring from a 92.8 MB database dump with mysql dbname < dbdump.sql took 3m 26.526s.
  • Restoring all of the binary logs from after the dump with mysqlbinlog -d dbname --start-position=107 mysql-bin.000{727..755} | mysql dbname took 38m 10.090s. The total size of the binlog files was 56.8 MB, and the size of the output from mysqlbinlog was 66.4 MB.
  • Combining the above two steps by just restoring from a (93.2 MB) dump taken after the events the binlogs cover took 2m 15.406s.

Is this time disparity normal? Can anything be done to reduce it?

Best Answer

You must understand what the binary logs are: they store all the queries (in STATEMENT format) or row changes (in ROW format) that happened since they were recorded. So, your command:

mysqlbinlog -d dbname --start-position=107 mysql-bin.000{727..755} | mysql dbname

basically executes all insert, updates & deletes that occurred to the server during those almost 30 binlogs, serially. That is why recovering from the binary logs is only useful for point in time recovery (recovering the latest changes since the last backup).

There are some things that you could do to reduce the application time, like reducing temporarily the durability and consistency constraints of the database (as you can always replay the steps) -innodb_flush_log_at_trx_commit, disable the binary log, the double write, etc.- or try using ROW binary logs only, which may increase the size on disk but make them faster to apply.

Alternatively, you could reduce the number of binlogs that you have to apply by creating more frequent full backups (the size of your full backups is not precisely big), creating real incremental or differential backups with tools like xtrabackup- or using lagged slaves for disaster recovery. This is because both mysqldump and mysqlbinlog recover data in a logical way- physical backups may be faster in some cases, specially for full recovery.

Obviouslly, there could be specific issues as well, for example, I have sometimes found that replaying LOAD DATA statements is slower than I would have thought.