MySQL – Fix Slave Skipping Some Updates

galeraMySQLmysql-5.1mysql-5.6percona

I have an existing Mysql 5.1.73 Master/Slave/Slave configuration, which is working correctly and all changes are being replicated.

I have now set up a PerconaDB (mysql+galera+XtraDB) 5.6.30 3-node cluster, which is working correctly.

One of the nodes in the Percona cluster is configured as a Replication Slave to the first cluster. This is being done in order to provide temporary synchronisation for migration purposes; eventually, the Master/Slave cluster will be removed and we will use only the Percona cluster.

If I use the mysql client, and make changes (add tables, add records, update records) on the Master, then this replicates to all clients, and to all nodes in the Percona cluster. So far so good.

However, if I use a certain web application to make changes, then these only replicate to the Master and original slaves, not to any of the Percona nodes.

I have verified that, if I use mysql to make the updates manually, they replicate correctly; however if done via the web application they seem to be filtered out by the PerconaDB slave node. Note that I do not have any visibility of the exact SQL being used by the application, and cannot stop the database in order to enable query logging.

Using SHOW SLAVE STATUS and SHOW MASTER STATUS I can see that the log position is increasing on both master and slave when a change is made.

Can anyone tell me any reason why an update from the webapp would be ignored by the Percona slave thread, when a 'manual' SQL command to do the same update appears to replicate correctly? Is this a version incompatibility, and is there any temporary workaround?

Master configuration (part)

[mysqld]
binlog-format = mixed
binlog-ignore-db = mysql
binlog-ignore-db = information_schema
expire_logs_days = 10
log-bin = /var/lib/mysql/mysql-bin
server-id = 1
skip-external-locking
sync_binlog = 1

Slave configuration (part)

[mysqld]
server-id = 1000
log-slave-updates = 1
relay-log = /var/lib/mysql/mysql-relay-bin.log
log_bin = /var/lib/mysql/mysql-bin.log
slave-skip-error = 1062,1050
replicate-ignore-db = mysql
replicate-ignore-db = information_schema
replicate-ignore-db = performance_schema

Master status:

File: mysql-bin.000027
Position: 1563
Binlog_Do_DB:
Binlog_Ignore_DB: mysql,information_schema

Slave status:

Slave_IO_State: Waiting for master to send event
Master_Host: x.x.x.x
Master_User: replication
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000027
Read_Master_Log_Pos: 1563
Relay_Log_File: mysql-relay-bin.000012
Relay_Log_Pos: 269
Relay_Master_Log_File: mysql-bin.000027
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB: mysql,information_schema,performance_schema
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1563
Relay_Log_Space: 442
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID:
Master_Info_File: /var/lib/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0

Warnings appearing on the Slave side:

2016-09-07 20:16:26 21354 [Warning] Slave I/O: Notifying master by SET @master_binlog_checksum= @@global.binlog_checksum failed with error: Unknown system variable 'binlog_checksum', Error_code: 1193
2016-09-07 20:16:26 21354 [Warning] Slave I/O: Unknown system variable 'SERVER_UUID' on master. A probable cause is that the variable is not supported on the master (version: 5.1.73-log), even though it is on the slave (version: 5.6.30-76.3-56-log), Error_code: 1193

Best Answer

Found the answer. At some point in the past, someone had reconfigured the 5.1 cluster so that node 2 became the master and node 1 the slave. This meant that I was actually replicating from a slave to the Galera cluster, and the application was writing to node 2.

This should still work; however the old master (node 1) did not have the log-slave-updates setting, and so updates form the application coming in via the (master) node 2 were not replicated on to the Galera cluster.

I changed to replicate from node 2 (the real master) and everything worked as expected.

So, in summary, it was a Layer 8 problem (IE, user error) caused by replicating from a slave without log-slave-updates set.