MySQL chain replication timing of writes on first slave

master-slave-replicationMySQLreplication

With a MySQL chain replication setup, e.g. M1->S1->S2, what exactly is the order of operations for writing to the first slave? Is the binary log from the master simply streamed directly through to the second slave? Or are database updates to the first slave committed before replicating those updates to the second slave?

The documentation for --log-slave-updates states This option causes the slave to write the updates performed by its SQL thread to its own binary log. It's not clear to me if this is done in parallel or only after data is written. I'm only interested in row-based replication, if that matters.

I ask because I have the unusual situation of a custom application acting as a second slave and operating on events that come through the binary log. It runs queries on the first slave. If the log events were to get to the application before writes on the first slave are committed, it will be operating on stale data.

Best Answer

To answer this, you need to understand the way MySQL replication works and the difference between the IO_thread and the SQL_thread.

In a nutshell, the IO_thread pulls events from the master and stores in the relay_log. The SQL_thread pulls from the relay_log and applies them to the slave dataset.

So, the description of --log-slave-updates (emphasis mine):

This option causes the slave to write the updates performed by its SQL thread to its own binary log

means that the MySQL is only logging to the first slave's binary log after it is being committed on that slave. And it won't be available to the second slave until it has been committed on the first slave.

So if I'm understanding you right, you should be safe in your us-ecase.


Single-threaded sql thread and slave lag

It is important to note that by default, the SQL thread is single-threaded. This means even though your master can apply changes across multiple connections very quickly, when it is applied to the first slave it is one change at a time. This can contribute to slave lag.

Luckily, in MySQL 5.6 the ability to have multiple slave thread appliers was introduced with the --slave-parallel-workers option.

However, this only helped if you had multiple schemas(databases), because MySQL would only apply in parallel if statements were in different schemas.

This limitation was removed in 5.7 to allow for parallel application within the same schema.

The main point I am trying to make, is that if you are dealing with slave lag on the first slave, the second slave will take similar time (assuming similar hardware) so will be even further behind your master. So ensure your custom application takes that into account.