MySQL Galera Cluster – Mass Update Delay Issues

galeraMySQLreplication

I have migrated my single MariaDB setup to a 3 node Galera cluster. I am currently in staging and most of normal operations run fine.

I have one issue though: if I do a mass update of one column of one table (500K rows) then the new value is available on the node where I ran the SQL UPDATE immediately after the transaction is done while it takes several seconds to propagate to the other nodes.

I thought Galera cluster was 100% synchronous and thus suprised to see that as it means that users could see different data…

My conf is:

[mysqld]
binlog_format=ROW
default_storage_engine=InnoDB
innodb_flush_log_at_trx_commit=0
innodb_autoinc_lock_mode=2
innodb_doublewrite=1
query_cache_size=0
wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_provider_options="gcache.size=300M; gcache.page_size=300M"
wsrep_cluster_name="xxxxx"
wsrep_cluster_address="gcomm://192.168.0.100,192.168.0.101,192.168.0.102"
wsrep_node_address="192.168.0.102"
wsrep_sst_auth="sst:yyyyyy"
wsrep_sst_method=xtrabackup-v2
wsrep_on=ON

Is this really expected or is there something to tweak?

Thanks!

Best Answer

Galera replication is not synchronous, only what they call 'virtually synchronous'. It's synchronous only up to the point where the write set is to be applied.

There might be a small period of time when a slave is out of sync. This happens because the master may apply events faster than the slave. And if you do read from the slave, you may read the data that has not changed yet.

However, this behavior can be changed by setting the wsrep_causal_reads=ON variable. In this case, the read on the slave will wait until the event is applied (this will obviously increase the response time of the read). The gap between the slave and the master is the reason why this replication is called virtually synchronous replication, and not real synchronous replication.

Source: Multi-Master Replication (Percona XtraDB Cluster - this page also has a diagram which explains the problem).

Another approach is to define one of the nodes as the 'master', and then write only to that node. Critical reads should then also be directed at the 'master' node, whereas less-critical reads can go to any node.

PS: Note that if you plan to write to all the nodes, you might find the following article useful: Galera’s big gotcha for MySQL users.

Edit:

Note that wsrep_causal_reads=ON has been deprecated by wsrep_sync_wait=1.