I am using MySQL-5.1.50 with a Master-slave replication setup.
Most of the time the slave is lagging behind the master.
When I run show processlist;
, there is no query that's taking a long time. I enabled slow_log
as well. However, it does not find any slow running query.
The slave is continuously giving alerts that replication is seconds behind the master. Sometimes, the lag time increases.
How do I diagnose the cause of the problem?
I need urgent help, since this problem has persisted for the last 20 days.
Best Answer
The Seconds_Behind_Master is really like viewing the past via time travel.
Think of it this way:
In like manner, it seems that the Master is processing a lot of queries at the same time.
You look back at the Slave, run
SHOW SLAVE STATUS\G
and it says 200 forSeconds_Behind_Master
. How is that number calculated? Slave's Clock Time (UNIX_TIMESTAMP(NOW()) - TIMESTAMP of the Query when it was completed and recorded in the Master's Binary Log.There is another metric to look at besides
Seconds_Behind_Master
. That metric is calledRelay_Log_Space
. That represents the sum of all bytes for all relay files on the Slave. By default, the largest single relay log is limited to 1GB. IfRelay_Log_Space
is less than 1GB, this indicates that many long running queries executed on the Master in parallel. Unfortunately, due to the single-threaded nature Replication's SQL thread, queries are executed one behind the other.For example, suppose you have the following scenario on the Master:
When the Slave reads those queries from its relay log and processes them one by one
Seconds_Behind_Master
Concerning the Slow Log, the default for long_query_time is 10 seconds. If all your queries in the relay logs are less than 10 seconds, you will never catch anything in the Slow Query Log.
I have the following recommendations for both Master and Slave servers
Apr 26, 2012
: Is the CPU performance relevant for a database server?Sep 20, 2011
: Multi cores and MySQL PerformanceSep 12, 2011
: Possible to make MySQL use more than one core?May 26, 2011
: About single threaded versus multithreaded databases performanceSeconds_Behind_Master
.FURTHER TROUBLESHOOTING
If you want to see the queries causing the replciation lag, do the following:
SHOW SLAVE STATUS\G
Relay_Log_File
STOP SLAVE;
START SLAVE;
cd /var/lib/mysql
or wherever the relay logs are writtenFor example, Let's do
SHOW SLAVE STATUS\G
If I run
STOP SLAVE; START SLAVE;
, the relay log closes and a new one is open. Yet, you wantrelay-bin.000030
.Dump the contents as follows:
You can now see the queries the Slave is currently trying to process. You can use those queries as the starting point for tuning.