Mysql – Speed up slave replication on large database / quickly spawn slave

MySQLmysql-5.1replication

We have a large MySQL database (about 100GB on disk) that we would like to spawn a slave for.

Our typical process for creating a slave is:

  1. Create a slave server
  2. Load in a mysql dump of the master
  3. Start the slave
  4. Wait for slave replication to sync w/ master
  5. Place slave server in production

The problem we are having is with the size of the database. By the time step two above is completed, the slave is so far behind the master that it is unable to catch up. In fact, the slave replication grows further and further behind. I'm not sure why this happens.

Any ideas on how to either seed the slave server faster, or fix the replication issue? Note that the master server has multiple databases, and a mixture of InnoDB/MyISAM tables. The server is running MySQL 5.1

Best Answer

There are two problems here, which must be solved independently.

Creating a slave

With that size, 100GB, mysqldump is usually to slow too be done efficiently. Try using a binary backup. You have several options: @paul is telling you one, but it has the inconvenient that the master will be locked for the duration of the copy process. Additionally, rsync can be very efficient if you have multiple small files, but it may not be for larger file sizes changed randomly throughout it (if you have a large ibdata1).

My recommendations would be snapshoting (if you are using virtual machines or a filesystem that allow it: ZFS, any other on top of LVM, etc.) or Percona Xtrabackup/Oracle Enterprise Backup. These options will make the backup process almost as fast as copying files from the filesystem without almost no lock at all. Some of them also allow for parallel copy, if your bandwidth allows it.

If none of these work for you, try using a logic parallel backup/restore utility like mydumper.

Replication with increasing lag

You must discover why that is happening first, (profile your queries on both servers) but these are some of the most common causes:

  • Use hardware/resources at least as good or better that the master for your slave. If the slave is slower, as replication runs in a single thread (mostly), the slave will lag.
  • If you have long-running transactions in concurrency, try using the binlog_format = ROW. It can augment your bandwidth usage but reduce the slave's load.
  • Try upgrading the MySQL versions, if that is a possibility. There has been a lot of performance improvements in terms of query execution and binary log in later versions. For example, one reason why you may be lagging is because the buffers are not hot on the recently-started slave. That is mitigated partially in the latest versions. Also, multiple-thread replication execution has been somewhat integrated.
  • You can relax some integrity configurations on the slave, as in the case of a crash, you can always re-import from the master in the event of the crash (innodb_flush_log_at_trx_commit, for example).
  • As a last resort, you can try alternative protocols that allow more synchronized communication between servers, but that is usually involves much more work.