MySQL Replication – Fix Duplicate Key Error on Fresh Slave

MySQLmysql-5.7replication

I have a master server (MySQL 5.7.18) that used to have a slave server associated with it for replication but the slave server had to be removed. I'm now setting up a new slave server (MySQL 5.7.18) but I'm having issues when I try to start the replication.

I started by using mysqldump to get a copy of the master using this command:

mysqldump \
  --all-databases \
  --single-transaction \
  --triggers \
  --routines \
  --events \
  --hex-blob \
  --socket=mysql.sock \
  --user=user \
  --password \
  | /bin/gzip --best > dbdump.sql.gz

I then copied the dbdump.sql.gz file to the server that the slave will run on and decompressed it.

I then set up a new MySQL instance and imported the contents of the dbdump.sql file.

At this point I checked to make sure the slave isn't already running:

mysql> show slave status\G
Empty set (0.00 sec)

I then issued the CHANGE MASTER statement:

mysql> CHANGE MASTER TO MASTER_HOST='xxx', MASTER_USER='yyy', MASTER_PASSWORD='zzz', MASTER_SSL=1, MASTER_AUTO_POSITION=1;
Query OK, 0 rows affected, 2 warnings (0.01 sec)

Because there were two warnings, I checked them:

mysql> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1759
Message: Sending passwords in plain text without SSL/TLS is extremely insecure.
*************************** 2. row ***************************
  Level: Note
   Code: 1760
Message: Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2 rows in set (0.00 sec)

At this point I haven't start the slave yet but I check the status to see where it stands after importing the data:

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: xxx
                  Master_User: yyy
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: 
          Read_Master_Log_Pos: 4
               Relay_Log_File: relay-bin.000001
                Relay_Log_Pos: 4
        Relay_Master_Log_File: 
             Slave_IO_Running: No
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 0
              Relay_Log_Space: 154
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: Yes
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 0
                  Master_UUID: 
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: 
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 8fa9aafc-....-....-....-............:1-3066976
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

I then start the slave:

mysql> start slave;
Query OK, 0 rows affected (0.01 sec)

After a couple of minutes I check the status of the slave and find the error:

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: System lock
                  Master_Host: xxx
                  Master_User: yyy
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000015
          Read_Master_Log_Pos: 21781064
               Relay_Log_File: relay-bin.000002
                Relay_Log_Pos: 367
        Relay_Master_Log_File: mysql-bin.000015
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1062
                   Last_Error: Could not execute Write_rows event on table DATABASE.TABLE; Duplicate entry '2017-05-01 15:09:28' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000015, end_log_pos 474
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 194
              Relay_Log_Space: 21781438
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: Yes
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1062
               Last_SQL_Error: Could not execute Write_rows event on table DATABASE.TABLE; Duplicate entry '2017-05-01 15:09:28' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000015, end_log_pos 474
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 8
                  Master_UUID: 8fa9aafc-....-....-....-............
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: 
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 170508 18:22:12
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 8fa9aafc-....-....-....-............:3066977-3091868
            Executed_Gtid_Set: 8fa9aafc-....-....-....-............:1-3066976
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (30.92 sec)

From what I can tell, the transaction that immediately happens after the last imported transaction from the master shows up as a duplicate entry. I've tried skipping the transaction but the next transaction shows the same problem.

I thought with GTID transactions, it should be automatically skipping transactions that have already been processed.

Is there something I'm missing? Is there something wrong with how the master is dumped?

Any help would be greatly appreciated.

Thank you.

Best Answer

I was able to finally get the replication to work by disabling the event scheduler before starting the dump and then re-enabling the event scheduler afterwards.

I'm not sure why would that would be necessary since I was using the --lock-all-tables option in mysqldump which I would have thought would have prevented any changes to the database by events while the dump was occurring.

Ultimately I used this modified command to get the server snapshot:

mysqldump \
  --all-databases \
  --lock-all-tables \
  --triggers \
  --routines \
  --events \
  --hex-blob \
  --flush-logs \
  --flush-privileges \
  --socket=mysql.sock \
  --user=user \
  --password \
  | /bin/gzip --best > dbdump.sql.gz

I took out of the --single-transaction option in favour of using the --lock-all-tables option since I had MYISAM tables.

I also added the --flush-logs and --flush-privileges options since I was taking a snapshot of the entire server these options seemed appropriate. I don't believe these two options should have had any impact on my particular problem.

I'm not sure if this actually fixes the issue or just works out because while the dump is taking place there are no changes being made to the database server but it's here in case someone else can benefit from it.

Thanks.