MySQL Slave replication: Need to determine where replication left off to start slave in correct location

mysql, replication

Idiotically, I rebooted a machine that is used as a MySQL slave without first running STOP SLAVE and FLUSH TABLES.

I had thought MySQL handled all of this automatically during a machine reboot, but apparently it doesn't, at least not in the configuration I was using, because replication on the slave doesn't start back up.

mysqld does start, but an error message in the logs shows the slave SQL thread stopping on a duplicate primary key error, meaning it is trying to insert data that has already been added.

Here are the errors generated in the mysql log:

120104 11:07:54 [Warning] Slave: Duplicate entry '94459' for key 'PRIMARY' Error_code: 1062
120104 11:07:54 [ERROR] Error running query, slave SQL thread aborted.
    Fix the problem, and restart the slave SQL thread with "SLAVE START".
    We stopped at log 'mysql1-bin.000362' position 3384732

How do I determine where in the master binary log replication needs to resume, so that I can run a CHANGE MASTER statement? I know I could skip log entries using sql_slave_skip_counter, but without knowing how many to skip I would need to go one by one, and that could take all day.

Best Answer

The primary thing you need to do was left in a comment:

Follow the suggestion in Duplicate entry in MySQL slave:

STOP SLAVE;
SET GLOBAL sql_slave_skip_counter=1;
START SLAVE;

However, here is how you can restart replication from the correct position with the CHANGE MASTER TO command.

Let's take a look at sample output from SHOW SLAVE STATUS\G:

               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.4.16.245
                  Master_User: replicant
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.001527
          Read_Master_Log_Pos: 554619670
               Relay_Log_File: relay-bin.004561
                Relay_Log_Pos: 554619815
        Relay_Master_Log_File: mysql-bin.001527
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB: phpmyadmin
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 554619670
              Relay_Log_Space: 554620007
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:

Please take note of Relay_Master_Log_File and Exec_Master_Log_Pos. Together they give the master binary log file and the position within it up to which the Slave's SQL thread has executed. The event at that position is the next one in line to be executed on the Slave.

So, the CHANGE MASTER TO in this case would be:

CHANGE MASTER TO master_log_file='mysql-bin.001527',master_log_pos=554619670;
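Picking two fields out of a long status listing by eye is error-prone. As a minimal sketch (the function names parse_slave_status and change_master_statement are my own, not part of MySQL), this Python parses the text of SHOW SLAVE STATUS\G and builds the statement from Relay_Master_Log_File and Exec_Master_Log_Pos. It assumes the log file name contains no single quotes.

```python
def parse_slave_status(text):
    """Turn the 'Field: value' lines of SHOW SLAVE STATUS\\G into a dict."""
    status = {}
    for line in text.splitlines():
        # Split on the first colon only; values such as error text may contain colons.
        name, sep, value = line.partition(":")
        if sep and name.strip():
            status[name.strip()] = value.strip()
    return status


def change_master_statement(status):
    """Build CHANGE MASTER TO from the executed (not merely read) coordinates."""
    log_file = status["Relay_Master_Log_File"]
    log_pos = int(status["Exec_Master_Log_Pos"])  # raises ValueError if not numeric
    return "CHANGE MASTER TO master_log_file='%s',master_log_pos=%d;" % (log_file, log_pos)


# A few lines taken from the sample status output above.
sample = """
        Master_Log_File: mysql-bin.001527
    Read_Master_Log_Pos: 554619670
  Relay_Master_Log_File: mysql-bin.001527
    Exec_Master_Log_Pos: 554619670
"""
print(change_master_statement(parse_slave_status(sample)))
```

The sketch deliberately ignores Master_Log_File and Read_Master_Log_Pos: those describe what the I/O thread has downloaded, which can be ahead of what has actually been applied.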

For your particular case, here is what you need to do on the Slave Server:

Step 01) Start mysql up with replication disabled at startup

$ service mysql restart --skip-slave-start

Step 02) Login to mysql and show the slave status:

mysql> SHOW SLAVE STATUS\G

Step 03) Get Relay_Master_Log_File and Exec_Master_Log_Pos from SHOW SLAVE STATUS\G

Step 04) Run CHANGE MASTER TO command using Relay_Master_Log_File and Exec_Master_Log_Pos

This will erase any relay logs collected and start collecting with a fresh, empty relay log.

Step 05) mysql> START SLAVE;

Step 06) mysql> SHOW SLAVE STATUS\G repeatedly to watch Seconds_Behind_Master go to 0

Step 07) If replication breaks due to Error 1062 (Duplicate Key), now you can implement @DTest's suggestion (the sql_slave_skip_counter commands shown at the top). Then go back to Step 06. Repeat this until Seconds_Behind_Master goes to 0.
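Steps 06 and 07 amount to a small decision loop. Here is a rough sketch of that decision in Python, assuming the status has already been read into a dict of strings keyed by the SHOW SLAVE STATUS field names. The function name and the returned labels are hypothetical, and nothing here talks to a server.

```python
DUPLICATE_KEY = 1062  # error code from the log message in the question


def next_action(status):
    """Decide what to do after one look at SHOW SLAVE STATUS (steps 06 and 07)."""
    io_running = status.get("Slave_IO_Running") == "Yes"
    sql_running = status.get("Slave_SQL_Running") == "Yes"
    errno = int(status.get("Last_SQL_Errno") or 0)

    if not sql_running and errno == DUPLICATE_KEY:
        # Step 07: skip the one offending event, then look again.
        return "skip_one_event"
    if not sql_running or not io_running:
        # Any other reason for a stopped thread needs a human to look at it.
        return "investigate"
    if status.get("Seconds_Behind_Master") == "0":
        return "caught_up"
    # Both threads are running but the slave is still replaying events.
    return "keep_watching"
```

A result of skip_one_event corresponds to running STOP SLAVE; SET GLOBAL sql_slave_skip_counter=1; START SLAVE; and then checking the status again.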

Your error log already gives you the coordinates: We stopped at log 'mysql1-bin.000362' position 3384732.

That's what you use:

CHANGE MASTER TO master_log_file='mysql1-bin.000362',master_log_pos=3384732;
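The same coordinates can be pulled out of the error log mechanically. This is a small Python sketch, assuming the log line has exactly the wording shown in the question; the function name is made up for illustration.

```python
import re

# Matches the wording of the message quoted in the question.
STOPPED_AT = re.compile(r"We stopped at log '([^']+)' position (\d+)")


def statement_from_error_log(log_text):
    """Return CHANGE MASTER TO for the last 'We stopped at' message, or None."""
    matches = STOPPED_AT.findall(log_text)
    if not matches:
        return None
    log_file, log_pos = matches[-1]  # the most recent stop is the relevant one
    return "CHANGE MASTER TO master_log_file='%s',master_log_pos=%s;" % (log_file, log_pos)


log = """120104 11:07:54 [ERROR] Error running query, slave SQL thread aborted.
    Fix the problem, and restart the slave SQL thread with "SLAVE START".
    We stopped at log 'mysql1-bin.000362' position 3384732"""
print(statement_from_error_log(log))
# prints: CHANGE MASTER TO master_log_file='mysql1-bin.000362',master_log_pos=3384732;
```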

If you ran RESET SLAVE on the Slave, do not despair. Just run the full syntax version of the CHANGE MASTER TO command:

CHANGE MASTER TO
MASTER_HOST='IP Address of Master',
MASTER_PORT=3306,
MASTER_USER='Replication Username',
MASTER_PASSWORD='Replication Password',
MASTER_LOG_FILE='mysql1-bin.000362',
MASTER_LOG_POS=3384732;