Mysql – Slave SQL thread got hanged

MySQLreplication

We have a master – slave setup with ROW based replication.
We are seeing huge delay's on the salve even though there is no activity running wither on master or slave.

When we looked in, we observed the SQL thread looks like hanged. It has been in "Reading event from the relay log" state since last 3 hours or more.

baleaf:(none)> show processlist ;
+--------+-------------+-----------+------+---------+-------+----------------------------------+-----    -------------+
| Id     | User        | Host      | db   | Command | Time  | State                            |        Info             |
+--------+-------------+-----------+------+---------+-------+----------------------------------+-----    -------------+
| 217159 | system user |           | NULL | Connect |  1039 | Waiting for master to send event | NULL             |
| 217160 | system user |           | NULL | Connect |  10045 | Reading event from the relay log | NULL             |
+--------+-------------+-----------+------+---------+-------+----------------------------------+-----    -------------+
4 rows in set (0.00 sec)



baleaf:blackarrow_dw> SHOW SLAVE STATUS \G
*************************** 1. row ***************************
               Slave_IO_State: Queueing master event to the relay log
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.001403
          Read_Master_Log_Pos: 95601911
               Relay_Log_File: relay-bin.000002
                Relay_Log_Pos: 12757514
        Relay_Master_Log_File: binlog.001403
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 32820355
              Relay_Log_Space: 75539220
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 7720
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
1 row in set (0.00 sec)

Shall some one please have a look into this ASAP.

Best Answer

You may need to reset the relay logs. PLEASE DON'T USE RESET SLAVE;

You need to reset the relay logs in such a way that it picks up from the last SQL statement it executed.

Please run the the following:

STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_FILE='binlog.001403',MASTER_LOG_POS=32820355;
START SLAVE;

If Seconds_Behind_Master stays at zero, CONGRATULATIONS !!!

If Seconds_Behind_Master goes back to 7720 immediately, check the date and time set in the OS on the Master against the the date and time set in the OS on the Slave.

Give it a Try !!!