Mysql – Is MySQL Replication Affected by a High-Latency Interconnect

MySQLreplication

We've got a vanilla master and slave MySQL setup that reside in different data centers, and another slave in the same datacenter as the master.

The bandwidth between the datacenter is pretty high (in network benchmarks we've done we can reach 15MB/second), but latency exists, it is around 28ms. It's not high by any means, but it is much higher than the sub-second latency in the same datacenter.

Occasionally, we experience serious lags (2000 seconds and more) with the remove slave, while the local slave stays up to date. When looking at the lagging remote slave, the SQL thread usually spends the time waiting for the IO thread to update the relay log. The master shows "waiting for net" or something of the sort at the same time.

So it means it's network, but we still have free bandwidth at the time this happens.

My question is: can the latency between the datacenters affect the performance of replication?
Does the slave io thread just stream the events until the master stops sending them, or is it pooling the master somehow between events?

Best Answer

The direct answer to your question is Yes, but it depends on the version of MySQL you are running. Before MySQL 5.5, replication would operate as follows:

  • Master Executes SQL
  • Master Records SQL Event in its Binary Logs
  • Slave Reads SQL Event from Master Binary Logs
  • Slave Stores SQL Event in its Relay Logs via I/O Thread
  • Slave Reads Next SQL Event From Relay Log via SQL Thread
  • Slave Executes SQL
  • Slave Acknowledges Master of the Complete Execution of the SQL Event

As of MySQL 5.5, using Semisynchronous Replication, now replication would operate as follows:

  • Master Executes SQL
  • Master Records SQL Event in its Binary Logs
  • Slave Reads SQL Event from Master Binary Logs
  • Slave Acknowledges Master of the Receipt of the SQL Event
  • Slave Stores SQL Event in its Relay Logs via I/O Thread
  • Slave Reads Next SQL Event From Relay Log via SQL Thread
  • Slave Executes SQL
  • Slave Acknowledges Master of the Complete Execution of the SQL Event

This new paradigm will permit a Slave to be closer sync'd to its Master.

Notwithstanding, latency within the network could hamper MySQL Semisync Replication to the point where it reverts back to the old-style asynchronous replication. Why ? If a timeout occurs without any slave having acknowledged the transaction, the master reverts to asynchronous replication. When at least one semisynchronous slave catches up, the master returns to semisynchronous replication.

UPDATE 2011-08-08 14:22 EDT

The configuration of MySQL 5.5 Semisynchronous Replication is straightforward

Step 1) Add these four(4) lines to /etc/my.cnf

[mysqld]
plugin-dir=/usr/lib64/mysql/plugin
#rpl_semi_sync_master_enabled
#rpl_semi_sync_master_timeout=5000
#rpl_semi_sync_slave_enabled

Step 2) Restart MySQL

service mysql restart

Step 3) Run these commands in the MySQL client

INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
INSTALL PLUGIN rpl_semi_sync_slave  SONAME 'semisync_slave.so';

Step 4) Uncomment the three rpm_semi_sync options after the plugin-dir option

[mysqld]
plugin-dir=/usr/lib64/mysql/plugin
rpl_semi_sync_master_enabled
rpl_semi_sync_master_timeout=5000
rpl_semi_sync_slave_enabled

Step 5) Restart MySQL

service mysql restart

All Done !!! Now just setup MySQL Replication as usual.