MySQL – Speed up slave replication on large database / quickly spawn slave

MySQL, mysql-5.1, replication

We have a large MySQL database (about 100GB on disk) that we would like to spawn a slave for.

Our typical process for creating a slave is:

1. Create a slave server
2. Load in a mysqldump of the master
3. Start the slave
4. Wait for slave replication to sync with the master
5. Place the slave server in production

The problem we are having is with the size of the database. By the time step two above is completed, the slave is so far behind the master that it is unable to catch up. In fact, the slave falls further and further behind, and I'm not sure why.

Any ideas on how to either seed the slave server faster, or fix the replication issue? Note that the master server has multiple databases, and a mixture of InnoDB/MyISAM tables. The server is running MySQL 5.1.

Best Answer

There are two problems here, which must be solved independently.

Creating a slave

At that size (100GB), mysqldump is usually too slow to be practical. Try a binary backup instead. You have several options: @paul describes one, but it has the drawback that the master is locked for the duration of the copy. Additionally, rsync can be very efficient when you have many small files, but it may not be for large files that change at random offsets (for example, a large ibdata1).

My recommendation would be snapshotting (if you are using virtual machines or a filesystem that allows it: ZFS, anything on top of LVM, etc.) or Percona XtraBackup/Oracle's MySQL Enterprise Backup. These options make the backup almost as fast as copying files from the filesystem, with almost no locking. Some of them also support parallel copying, if your bandwidth allows it.

If none of these work for you, try a parallel logical backup/restore utility like mydumper.
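The physical-backup route recommended above could look roughly like the following sketch. It assumes the `xtrabackup` binary from a Percona XtraBackup release that supports your server version (older releases used the `innobackupex` wrapper, especially for MyISAM tables); the backup directory, slave host name, and datadir are hypothetical. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: seed a new slave from a physical backup instead of mysqldump.
# Hypothetical values (not from the original answer): BACKUP_DIR, SLAVE_HOST,
# and the slave's datadir /var/lib/mysql.
BACKUP_DIR=${BACKUP_DIR:-/data/backups/seed}
SLAVE_HOST=${SLAVE_HOST:-new-slave.example.com}
# RUN=echo (the default) only prints each command; set RUN to empty to execute.
RUN=${RUN-echo}

seed_slave() {
    # 1. Copy the data files while the master keeps serving traffic.
    $RUN xtrabackup --backup --target-dir="$BACKUP_DIR" || return 1
    # 2. Apply the redo log so the copied files are consistent.
    $RUN xtrabackup --prepare --target-dir="$BACKUP_DIR" || return 1
    # 3. Ship the prepared files to the (stopped) slave's data directory.
    $RUN rsync -a "$BACKUP_DIR/" "$SLAVE_HOST:/var/lib/mysql/" || return 1
}
```

After sourcing the file, `seed_slave` prints the three commands, and `RUN=; seed_slave` executes them. XtraBackup records the master's binary log coordinates in `xtrabackup_binlog_info` inside the backup directory; those are the values to feed into `CHANGE MASTER TO` on the new slave.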

Replication with increasing lag

You must first discover why that is happening (profile your queries on both servers), but these are some of the most common causes and remedies:

Use hardware/resources for your slave that are at least as good as the master's. Because replication runs (mostly) in a single thread, a slower slave will lag.
If you have long-running concurrent transactions, try binlog_format = ROW. It can increase your bandwidth usage but reduce the slave's load.
Try upgrading MySQL, if that is a possibility. There have been many performance improvements in query execution and binary logging in later versions. For example, one reason you may be lagging is that the buffers are not yet warm on the recently started slave; that is partially mitigated in the latest versions. Multi-threaded replication has also been partially integrated.
You can relax some durability settings on the slave (innodb_flush_log_at_trx_commit, for example), since in the event of a crash you can always re-import from the master.
As a last resort, you can try alternative protocols that allow more synchronous communication between servers, but that usually involves much more work.
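To tell whether the slave is actually losing ground, and to try the durability relaxation mentioned above, something like this sketch may help. The function names are made up for illustration, and it assumes the `mysql` client can log in without extra options (for example via `~/.my.cnf`).

```shell
#!/bin/sh
# Sketch only: watch replication lag and relax durability on the slave.

# Extract Seconds_Behind_Master from `SHOW SLAVE STATUS\G` output on stdin.
slave_lag() {
    awk -F': ' '/Seconds_Behind_Master:/ { print $2 }'
}

# Compare two integer lag samples (older first): prints growing,
# shrinking, or steady. Does not handle NULL (replication stopped).
lag_trend() {
    if [ "$2" -gt "$1" ]; then
        echo growing
    elif [ "$2" -lt "$1" ]; then
        echo shrinking
    else
        echo steady
    fi
}

# Assumption: losing about a second of commits on a slave crash is
# acceptable, because the slave can be re-seeded from the master.
relax_slave_durability() {
    mysql -e "SET GLOBAL innodb_flush_log_at_trx_commit = 2"
}

# Example use on the slave:
#   before=$(mysql -e 'SHOW SLAVE STATUS\G' | slave_lag); sleep 60
#   after=$(mysql -e 'SHOW SLAVE STATUS\G' | slave_lag)
#   lag_trend "$before" "$after"
```

If the trend stays at "growing" even after relaxing durability, the single-threaded apply is likely the bottleneck, which points back to hardware, row-based logging, or an upgrade.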

Circular Replication

Circular Replication is nothing more than first setting up Master/Slave, then performing the same steps in the opposite direction, so that the Slave also acts as the Master's Master and the Master as the Slave's Slave. It just entails:

explicitly using a different server_id on each DB Server
enabling binary logging on both DB servers
making sure the replication user is defined on both DB servers
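As a rough sketch of those steps, the helpers below only print the SQL to run on each server. The `repl` user name, the `CHANGE_ME` password, and the function names are placeholders; the log file and position must come from `SHOW MASTER STATUS` on the peer. The `GRANT ... IDENTIFIED BY` form fits MySQL 5.1 but was removed in MySQL 8.0. The distinct `server_id` values and `log-bin` still have to be set in each server's my.cnf.

```shell
#!/bin/sh
# Sketch only: print the SQL that points one server at its peer.
# 'repl' and CHANGE_ME are placeholders, not values from the original answer.

# usage: repl_user_sql PEER_HOST   (run the output on both servers)
repl_user_sql() {
    echo "GRANT REPLICATION SLAVE ON *.* TO 'repl'@'$1' IDENTIFIED BY 'CHANGE_ME';"
}

# usage: change_master_sql PEER_HOST LOG_FILE LOG_POS
# LOG_FILE and LOG_POS come from SHOW MASTER STATUS on the peer.
change_master_sql() {
    cat <<EOF
CHANGE MASTER TO
  MASTER_HOST='$1',
  MASTER_USER='repl',
  MASTER_PASSWORD='CHANGE_ME',
  MASTER_LOG_FILE='$2',
  MASTER_LOG_POS=$3;
START SLAVE;
EOF
}
```

Running `change_master_sql` once on each server, each time with the other server's address and coordinates, is what closes the circle.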

Database Virtual IP (aka DBVIP)

There are products you can download and install to set up a DBVIP. One such product I use is ucarp. Another product is Linux Heartbeat. I normally do not use such things with MySQL Circular Replication or Master/Slave. Why?

Those products perform automatic failover, and you do not want that to happen while a Slave is some number of seconds behind in replication.

You should perform manual failovers.

Here is a poor man's approach to implementing DBVIP management.

Suppose you have this setup

DB Server1 has IP 10.1.2.30
DB Server2 has IP 10.1.2.40
You want to use DBVIP 10.1.2.50

Create the script called /usr/local/sbin/MyAppDBVIP, a one-line script that prints the DBVIP, like this

echo echo 10.1.2.50 > /usr/local/sbin/MyAppDBVIP

Create the Script called /usr/local/sbin/dbvip-up

DBVIP=`/usr/local/sbin/MyAppDBVIP`
ip addr add ${DBVIP}/24 dev eth1

Create the Script called /usr/local/sbin/dbvip-down

DBVIP=`/usr/local/sbin/MyAppDBVIP`
ip addr del ${DBVIP}/24 dev eth1

Make sure all scripts are executable

chmod +x /usr/local/sbin/MyAppDBVIP
chmod +x /usr/local/sbin/dbvip-up
chmod +x /usr/local/sbin/dbvip-down

Make sure these scripts exist on both DB Servers

Simply run dbvip-up on whichever server you choose.
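Before bringing the DBVIP up or down, it can help to check which server currently holds it. A minimal sketch, assuming the same eth1 interface as the scripts above; the `holds_vip` name is made up for illustration.

```shell
#!/bin/sh
# Sketch only: report whether this server currently holds the DBVIP.

# usage: ip -o -4 addr show dev eth1 | holds_vip 10.1.2.50
# Reads `ip -o -4 addr show` output on stdin; succeeds if the VIP is listed.
holds_vip() {
    # -F matches the address literally, so the dots are not regex wildcards.
    grep -q -F "inet $1/"
}
```

For example, `ip -o -4 addr show dev eth1 | holds_vip "$(/usr/local/sbin/MyAppDBVIP)" && echo "DBVIP is here"` prints the message only on the server that has the address.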

So the failover process and protocol are the following:

Run dbvip-down on the DB Server that has the DBVIP. If you cannot
Run dbvip-up on the DB Server that you want to have the DBVIP
Just remember you should not run dbvip-up on both machines
After running dbvip-up, restart Apache, JBoss, or any other app server that was contacting MySQL via the old Master
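Since the point of a manual failover is to avoid promoting a server that is still behind, the dbvip-up step could be wrapped in a small guard. This is only a sketch with made-up function names, and it assumes the `mysql` client can log in without extra options.

```shell
#!/bin/sh
# Sketch only: refuse to take the DBVIP while this slave is still behind.

# usage: lag_allows_failover SECONDS_BEHIND_MASTER
# Succeeds only for exactly 0; NULL, an empty value, or any lag fails.
lag_allows_failover() {
    [ "$1" = "0" ]
}

take_dbvip() {
    lag=$(mysql -e 'SHOW SLAVE STATUS\G' |
          awk -F': ' '/Seconds_Behind_Master:/ { print $2 }')
    if lag_allows_failover "$lag"; then
        /usr/local/sbin/dbvip-up
    else
        echo "Not taking DBVIP: Seconds_Behind_Master is '${lag}'" >&2
        return 1
    fi
}
```

Seconds_Behind_Master reads NULL when replication is stopped or the old Master is unreachable, so in that case the guard refuses and the decision stays with the operator, in keeping with the manual approach.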
