I have a master-master MySQL 5.7, everything works fine except for one error which is rising from time to time.
If I run a show slave status\G
on server2 I have an error on a table:
...
Last_Errno: 1032
Last_Error: Could not execute Update_rows event on table my_database_name.my_table_name; Can't find record in 'my_table_name', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.000120, end_log_pos 83145706
...
I can temporarily skip the error by running:
STOP SLAVE;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;
or by setting
slave-skip-errors = 1032
skip-slave-start
in server2's my.cnf and then restarting MySQL.
However, I would like to fix it permanently so I checked in server1's binary logs and I found this:
### UPDATE `my_database_name`.`my_table_name`
### WHERE
### @1=140
### @2='2014:02:02'
### @3=2878
### @4=3253
### @5=''
### @6=0.00
### @7=0
### @8=35.75
### @9=0.00
### @10=0
### @11=0
### SET
### @1=140
### @2='2014:02:02'
### @3=2878
### @4=3254
### @5=''
### @6=0.00
### @7=0
### @8=35.75
### @9=0.00
### @10=0
### @11=0
# at 83145706
I see it tries to update the table by setting the value 3254
in the 4th column where the 4th column itself has value 3253. Anyway, if I look at the table on the servers, it has a different value on both:
server1:
mysql> select * from my_table_name where my_table_name.id = 140;
+-----+------------+----------+----------+---------+-----------+---------+---------+-----------+---------------------+---------------------+
| id | col2 | col3 | col4 | col5 | col6 | col7 | col8 | col9 | col10 | col11 |
+-----+------------+----------+----------+---------+-----------+---------+---------+-----------+---------------------+---------------------+
| 140 | 2014-02-02 | 2878 | 3254 | | 0.00 | 0 | 35.75 | 0.00 | 0000-00-00 00:00:00 | 0000-00-00 00:00:00 |
+-----+------------+----------+----------+---------+-----------+---------+---------+-----------+---------------------+---------------------+
server2:
mysql> select * from my_table_name where my_table_name.id = 140;
+-----+------------+----------+----------+---------+-----------+---------+---------+-----------+---------------------+---------------------+
| id | col2 | col3 | col4 | col5 | col6 | col7 | col8 | col9 | col10 | col11 |
+-----+------------+----------+----------+---------+-----------+---------+---------+-----------+---------------------+---------------------+
| 140 | 2014-02-02 | 2878 | 3257 | | 0.00 | 0 | 35.75 | 0.00 | 0000-00-00 00:00:00 | 0000-00-00 00:00:00 |
+-----+------------+----------+----------+---------+-----------+---------+---------+-----------+---------------------+---------------------+
Considering it's master-master, how can I align the tables? I think I can't simply set the value 3253
hoping it's going to be updated by the binary log, can I?
If possible, I want to avoid re-syncing the whole database because it's really huge.
Thank you!
Best Answer
You could use pt-table-checksum and pt-table-sync to perform an optimized synchronisation while online and it will backoff to avoid replication lag.