MySQL Master-Master Replication – Single Table Out of Sync

data synchronizationmaster-master-replicationMySQLmysql-5.7replication

I have a master-master MySQL 5.7, everything works fine except for one error which is rising from time to time.

If I run a show slave status\G on server2 I have an error on a table:

...
Last_Errno: 1032
Last_Error: Could not execute Update_rows event on table my_database_name.my_table_name; Can't find record in 'my_table_name', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.000120, end_log_pos 83145706
...

I can temporarily skip the error by running:

STOP SLAVE;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;

or by setting

slave-skip-errors = 1032
skip-slave-start

in server2's my.cnf and then restarting MySQL.

However, I would like to fix it permanently so I checked in server1's binary logs and I found this:

### UPDATE `my_database_name`.`my_table_name`
### WHERE
###   @1=140
###   @2='2014:02:02'
###   @3=2878
###   @4=3253
###   @5=''
###   @6=0.00
###   @7=0
###   @8=35.75
###   @9=0.00
###   @10=0
###   @11=0
### SET
###   @1=140
###   @2='2014:02:02'
###   @3=2878
###   @4=3254
###   @5=''
###   @6=0.00
###   @7=0
###   @8=35.75
###   @9=0.00
###   @10=0
###   @11=0
# at 83145706

I see it tries to update the table by setting the value 3254 in the 4th column where the 4th column itself has value 3253. Anyway, if I look at the table on the servers, it has a different value on both:

server1:

mysql> select * from my_table_name where my_table_name.id = 140;
+-----+------------+----------+----------+---------+-----------+---------+---------+-----------+---------------------+---------------------+
| id  | col2       | col3     | col4     | col5    | col6      | col7    | col8    | col9      | col10               | col11               |
+-----+------------+----------+----------+---------+-----------+---------+---------+-----------+---------------------+---------------------+
| 140 | 2014-02-02 |     2878 |     3254 |         |      0.00 |       0 |   35.75 |      0.00 | 0000-00-00 00:00:00 | 0000-00-00 00:00:00 |
+-----+------------+----------+----------+---------+-----------+---------+---------+-----------+---------------------+---------------------+

server2:

 mysql> select * from my_table_name where my_table_name.id = 140;
+-----+------------+----------+----------+---------+-----------+---------+---------+-----------+---------------------+---------------------+
| id  | col2       | col3     | col4     | col5    | col6      | col7    | col8    | col9      | col10               | col11               |
+-----+------------+----------+----------+---------+-----------+---------+---------+-----------+---------------------+---------------------+
| 140 | 2014-02-02 |     2878 |     3257 |         |      0.00 |       0 |   35.75 |      0.00 | 0000-00-00 00:00:00 | 0000-00-00 00:00:00 |
+-----+------------+----------+----------+---------+-----------+---------+---------+-----------+---------------------+---------------------+

Considering it's master-master, how can I align the tables? I think I can't simply set the value 3253 hoping it's going to be updated by the binary log, can I?

If possible, I want to avoid re-syncing the whole database because it's really huge.

Thank you!

Best Answer

You could use pt-table-checksum and pt-table-sync to perform an optimized synchronisation while online and it will backoff to avoid replication lag.