The "Time" in the SQL thread is (I think) identical to Seconds_behind_master. It is "How long ago did this query start on the Master ".
All other Times are indicate when the query started on the Slave.
Some fluctuation is caused by what it is measuring (the Master's start time).
Sometimes (rarely), I see the value (both places) bouncing between 0 and some large value. I have yet to track this down. I have seen it on 4.0, 4.1, and 5.1. It eventually goes away, and becomes civilized.
There may be cases where no traffic leads to strange values. But I don't have any Master-Slave setups with little enough traffic for me to comment.
Suppose you do ALTER on the Master, and it took 1 hour (3600 seconds). Also, suppose not much else is going on. The ALTER replicates and starts running. Immediately, the Seconds_behind_master will be about 3600. After the ALTER finishes on the Slave (say, 3600 more seconds later), subsequent replication items will execute with (probably) smaller Times. Eventually replication catches up.
Having replication with only one system user
sounds a lot like MySQL 4.1 replication. It was single threaded. That does not sound anything like MySQL 5.x replication.
When you run SHOW SLAVE STATUS\G
, you will see something like this:
mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.48.20.253
Master_User: replicant
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000262
Read_Master_Log_Pos: 803779735
Relay_Log_File: relay-bin.067799
Relay_Log_Pos: 402744
Relay_Master_Log_File: mysql-bin.000262
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 803779735
Relay_Log_Space: 402598
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
1 row in set (0.00 sec)
Please note the two lines
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
When you see Yes to both of these lines, there must exist two process IDs with system user
as the user. If you only see one, then either the IO thread (network connection back to the Master) is gone, or the SQL thread (load thread to read relay logs) encountered a MySQL error.
Thread States for Replication
- IO is Yes and SQL is Yes : Replication should be running
- IO is No and SQL is Yes : Network Connection to Master is gone
- IO is Yes and SQL is No : SQL Error reading from Relay Logs
- IO is No and SQL is No
- Someone or Something ran
STOP SLAVE;
- MySQL started up with
skip-slave-start
configured
If both Slave_IO_Running
and Slave_SQL_Running
are Yes
and Relay_Log_Space
keeps on increasing, then my guess would be that LOAD DATA INFILE
was executed on the Master and all the data is being shipped through the relay logs as a single unit (See my post MySQL 5.6 showing wrong "second_behind_master").
The only other thing that can cause
Slave_IO_Running
to be Yes
Slave_SQL_Running
to be Yes
Relay_Log_Space
to be be increasing
Would be a very heavy write transaction with row-based replication executed as a group commit.
UPDATE 2014-11-17 17:30 EST
Based on the chat room conversation, we have discussed
- Over 2TB of InnoDB data, so setting up replication from scratch is out of the question
- Discovered that binary logging is enabled on the Slave
- I proposed that binary logging on the Slave maybe causing the lag
- I recent encountered this with a client
- They lag one extra second every two seconds
- I disabled binary logging on the Slave and it worked
- My suggestion for this post is to disable binary logging on the Slave like this
STOP SLAVE;
SET GLOBAL innodb_fast_shutdown = 0;
vi /etc/my.cnf
- Comment out log-bin=master-bin
service mysql restart
START SLAVE;
Best Answer
See the MySQL documentation or a Step by step replication setup blog if you need to review.
If these doesn't help please share the output of
show slave status\G
(as such it appears connected) and also output ofmy.cnf
of master/slave. Mainly you need to worry about server-ids, replication filters (replicate-to-db/table etc)