Mysql – Server load discrepency in Slave with Master load being low and constant

linuxMySQLmysql-5.5replication

We have MySQL Replication

  • Master/Slave Setup
    • db01 Master
    • db02 Slave (ReadOnly)

We are getting load discrepency on the Slave (db02), but the Master's (db01) server load is low. MySQL is the only active process on the Slave.

The CPU iowait is increased.

12:00:01 AM       CPU     %user     %nice   %system   %iowait    %steal     %idle
12:10:01 AM       all      0.07      0.00      0.02      0.01      0.00     99.90
12:20:01 AM       all      0.06      0.00      0.01      0.01      0.00     99.92
12:30:01 AM       all      0.05      0.01      0.01      0.01      0.00     99.92
12:40:01 AM       all      0.06      0.00      0.01      0.01      0.00     99.92
12:50:01 AM       all      0.06      0.00      0.01      0.01      0.00     99.92
01:00:01 AM       all      0.05      0.00      0.01      0.01      0.00     99.93
01:10:01 AM       all      0.07      0.00      0.02      0.01      0.00     99.90
01:20:01 AM       all      0.06      0.00      0.01      0.01      0.00     99.92
01:30:01 AM       all      0.05      0.01      0.01      0.01      0.00     99.92
01:40:01 AM       all      0.06      0.00      0.01      0.02      0.00     99.91
01:50:01 AM       all      0.05      0.00      0.01      0.01      0.00     99.93
02:00:01 AM       all      0.05      0.00      0.01      0.01      0.00     99.93
02:10:01 AM       all      0.06      0.00      0.02      0.01      0.00     99.91
02:20:01 AM       all      0.05      0.00      0.01      0.01      0.00     99.93
02:30:01 AM       all      0.05      0.01      0.01      0.01      0.00     99.93
02:40:01 AM       all      0.06      0.00      0.01      0.01      0.00     99.92
02:50:01 AM       all      0.05      0.00      0.01      0.01      0.00     99.93
03:00:01 AM       all      0.05      0.00      0.01      0.01      0.00     99.94
03:10:01 AM       all      0.07      0.00      0.02      0.40      0.00     99.52
03:20:01 AM       all      0.05      0.00      0.01      0.59      0.00     99.35
03:30:01 AM       all      0.05      0.01      0.01      0.50      0.00     99.44
03:40:01 AM       all      0.05      0.00      0.01      0.51      0.00     99.43
03:50:01 AM       all      0.05      0.00      0.01      0.71      0.00     99.22
04:00:01 AM       all      0.05      0.00      0.01      0.47      0.00     99.47
04:10:01 AM       all      0.06      0.00      0.01      0.63      0.00     99.30
04:20:01 AM       all      0.05      0.00      0.02      0.48      0.00     99.45
04:30:01 AM       all      0.05      0.01      0.01      0.48      0.00     99.45
04:40:01 AM       all      0.05      0.00      0.01      0.54      0.00     99.39
04:50:01 AM       all      0.05      0.00      0.01      0.53      0.00     99.40
05:00:01 AM       all      0.05      0.00      0.01      0.55      0.00     99.39
05:10:01 AM       all      0.06      0.00      0.02      0.81      0.00     99.11
05:20:01 AM       all      0.05      0.00      0.01      0.67      0.00     99.27
05:30:01 AM       all      0.05      0.01      0.01      0.57      0.00     99.36
05:40:01 AM       all      0.06      0.00      0.02      0.58      0.00     99.35
05:50:01 AM       all      0.06      0.00      0.02      0.67      0.00     99.25
06:00:01 AM       all      0.05      0.00      0.01      0.62      0.00     99.31
06:10:01 AM       all      0.06      0.00      0.02      0.74      0.00     99.18
Average:          all      0.06      0.00      0.01      0.30      0.00     99.63

On the Master(db01), the load is low and constant. On the Slave(db02), load is increasing.

Questions

  • Why is this happeningit will happen we are not understood
  • How can we debug this or resolve it

Please help me out here.

Best Answer

You will have to look at this from different perspectives

Data / Storage Engine

The InnoDB Storage Engine is constructed to do as little writing as possible. As an ACID-complaint storage engine, InnoDB has mechanisms in place for concurrent manipulation of data. The structures used by the InnoDB Infrastrcuture are the following:

  • Buffer Pool : Changes to Data and Index Pages that are Cached in RAM
  • Double-Write Buffer : Backup buffer to prevent data corruption
  • Insert Buffer : Changes to secondary non-unique indexes
  • Redo Logs : Playback instructions in the event of Crash Recovery
  • Undo Logs : Reverse instructions in the event of Crash Recovery/Rollback
  • See Pictorial Representation

Master

In all likelihood, there are concurrent operations from multiple DB Connections performing

  • INSERTs
  • UPDATEs
  • DELETEs
  • SELECTs

on the Master. These changes and lookups are cached and managed with the InnoDB Infrastructure. Flushing all recorded changes is serially managed but has no immediate effect on data concurrency. The only discernible bottleneck would be the time an SQL transaction decides to write the SQL Commands into its Binary Logs. The OS is responsible for flushing changes to binary logs to disk.

Slave

Here is where things can get interesting. While the Master handles everything thrown at it, it has to write the SQL Commands into its Binary Logs. The Slave has the responsibility to process what the Master gives it.

Here is the MySQL Replication Paradigm

  • IO Thread on Slave Connects to Master
  • The Master Does the Following on a Continuous Basis:
    • Completes SQL Command
    • Records SQL Command in its Binary Logs
  • The Slave Does the Following on a Continuous Basis:
    • IO Thread from Slave Reads incoming SQL Commands from the Master's Binary Logs
    • IO Thread Records the SQL into its Relay Logs
    • SQL Thread from Slave Reads Next SQL Command to Be Executed
    • SQL Thread Executes the Latest SQL Command From the Relay Logs
    • If Binary Logging is enabled on the Slave, Slave Records SQL Command in its Binary Logs

If you carefully follow the flow of control from this Paradigm description, you can easily see that all the SQL sent from the Master is serialized. In other words, the SQL commands from the Relay Logs are executed one at a time. This can manifest itself on the Slave as follows:

  • When you run SHOW SLAVE STATUS\G, you see Seconds_Behind_Master increasing
  • When you run SHOW PROCESSLIST;, the SQL thread (whose user is 'system user') is executing an SQL statement that is taking a long time on its own.

This can easily happen when a Master is handling hundreds of SQL commands in parallel. It is when the SQL statement are lined up single-file and recorded as a FIFO queue on the Master, that the Slave must then process each command first-come, first serve only without any benefits from parallelism.

Conclusions Drawn

From the roles of Master and Slave given an all-InnoDB database, a Master to could handle multiple SQL transactions and be done with all commands. By design a Slave has no choice but to process commands in a serialized fashion. The background processing performed by a Master would easily get finished a stay quiet for a long time while a rather busy Slave would have process SQL one-by-one and have resources handle processed changes in a similar manner.

Troubleshooting / Resolution

Once you can see this funneling of SQL from Master to Slave, your only recourse would be to consider some of the following options to apply to the Slave and Master

  • Query Optimization
  • Create the needed indexes to support Query Optimization
  • Make adjustments in the networking setup between Master and Slave
    • Install a 2nd NIC on Master and Slave
    • Assign IP address to the 2nd NIC (NIC_IP)
    • Run CHANGE MASTER TO command to assign (NIC_UP) to MASTER_HOST
  • Upgrade to MySQL 5.5 (if not using MySQL 5.5)
  • Configure MySQL 5.5
  • Upgrade RAM
  • Change to Individual Tablespaces