I have migrated my single MariaDB setup to a 3 node Galera cluster. I am currently in staging and most of normal operations run fine.
I have one issue though: if I do a mass update of one column of one table (500K rows) then the new value is available on the node where I ran the SQL UPDATE immediately after the transaction is done while it takes several seconds to propagate to the other nodes.
I thought Galera cluster was 100% synchronous and thus suprised to see that as it means that users could see different data…
My conf is:
[mysqld]
binlog_format=ROW
default_storage_engine=InnoDB
innodb_flush_log_at_trx_commit=0
innodb_autoinc_lock_mode=2
innodb_doublewrite=1
query_cache_size=0
wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_provider_options="gcache.size=300M; gcache.page_size=300M"
wsrep_cluster_name="xxxxx"
wsrep_cluster_address="gcomm://192.168.0.100,192.168.0.101,192.168.0.102"
wsrep_node_address="192.168.0.102"
wsrep_sst_auth="sst:yyyyyy"
wsrep_sst_method=xtrabackup-v2
wsrep_on=ON
Is this really expected or is there something to tweak?
Thanks!
Best Answer
Galera replication is not synchronous, only what they call 'virtually synchronous'. It's synchronous only up to the point where the write set is to be applied.
Source: Multi-Master Replication (Percona XtraDB Cluster - this page also has a diagram which explains the problem).
Another approach is to define one of the nodes as the 'master', and then write only to that node. Critical reads should then also be directed at the 'master' node, whereas less-critical reads can go to any node.
PS: Note that if you plan to write to all the nodes, you might find the following article useful: Galera’s big gotcha for MySQL users.
Edit:
Note that
wsrep_causal_reads=ON
has been deprecated bywsrep_sync_wait=1
.