I've set up a slave db server for my production database, but when I checked the show slave status, I noticed a super big number in seconds behind master.
This is the output:
Slave_IO_State: Waiting for master to send event
Master_Host: 1.2.3.4
Master_User: replicator
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000173
Read_Master_Log_Pos: 15909435
Relay_Log_File: mysqld-relay-bin.000079
Relay_Log_Pos: 91173356
Relay_Master_Log_File: mysql-bin.000093
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 91173210
Relay_Log_Space: 8179978166
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 486330
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
1 row in set (0.00 sec)
ERROR:
No query specified
Then when I run SHOW PROCESSLIST, I see that the time of the thread matches the time indicated in seconds behind:
mysql> SHOW PROCESSLIST;
| 40 | system user | | NULL | Connect | 66530 | Waiting for master to send event | NULL |
| 41 | system user | | NULL | Connect | 486330 | Reading event from the relay log | NULL |
| 45 | root | localhost | NULL | Query | 0 | NULL | SHOW PROCESSLIST |
That time is dropping, slowly. Read_Master_Log_Pos, Relay_Log_Pos, Exec_Master_Log_Pos and Relay_Log_Space are changing all the time.
I've also check the time/date and both servers are on sync.
On the Master side:
mysql> SHOW PROCESSLIST;
| 66739 | replicator | 1.2.3.5:52884 | NULL | Binlog Dump | 65671 | Master has sent all binlog to slave; waiting for binlog to be updated | NULL
and show slave hosts looks empty…
mysql> SHOW SLAVE HOSTS;
+-----------+------+------+-----------+
| Server_id | Host | Port | Master_id |
+-----------+------+------+-----------+
| 2 | | 3306 | 1 |
+-----------+------+------+-----------+
1 row in set (0.00 sec)
mysql>
So what is actually going on here? Looks like the slave is actually connected and working, but very very slow? Can someone give me some hints on how to do more debug on this? Server is rather idle at 95%.
Best Answer
When you see the
Seconds_Behind_Master
that high, I look at the following:You have 7.6182GB of relay logs to process.
This tell me that you have read up to
mysql-bin.000173
, but you are currently processing things from themysql-bin.000093
.This also tell me you have about 80 binary logs on the Master, each about 100 MB.
The
Seconds_Behind_Master
is simply the NOW() minus the TIMESTAMP set atmysql-bin.000093
(Relay_Master_Log_File) position91173210
(Exec_Master_Log_Pos).As long as Slave_SQL_Thread is Yes, the relay logs get processed
Relay_Log_Space
will decrease each time a relay log is doneExec_Master_Log_Pos
will increase until current relay log is done, then resets to the beginning of the next relaySeconds_Behind_Master
decrease (NOW() minus the TIMESTAMP set at Relay_Master_Log_File position Exec_Master_Log_Pos)This is what happens when replication is off for 486330 seconds (5 days 15 hours 5 minutes 29 seconds) and you run
start slave;
Look at your
SHOW PROCESSLIST;
. The IO Thread has been up for 66530 seconds (18 hours 28 minutes 50 seconds). This means someone or something started replication 18 hours 28 minutes 50 seconds ago.You stated in your question that you have set up replication for the production server. This means that you ran the mysqldump 5 days 15 hours 5 minutes 29 seconds ago and started replicating from the production master 18 hours 28 minutes 50 seconds ago.
If you had setup the Slave the same day you got the mysqldump from the Master the replication load would be a lot less. Notwithstanding, replication is working normally provided
Slave_IO_Thread
andSlave_SQL_Thread
both sayYes
.