MariaDB Galera Cluster wsrep_cluster_address not working

clustering galera mariadb

I have configured a MariaDB Galera cluster with three nodes on CentOS 6.5 (64-bit):

db1 - 192.168.1.111
db2 - 192.168.1.112
db3 - 192.168.1.113

My /etc/my.cnf.d/server.cnf file is configured as follows:

[galera]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0
datadir=/var/lib/mysql
innodb_log_file_size=100M
innodb_file_per_table
innodb_flush_log_at_trx_commit=2

wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://192.168.1.111,192.168.1.112,192.168.1.113"
wsrep_cluster_name='galera_cluster'
wsrep_node_address='192.168.1.112'
wsrep_node_name='db2'
wsrep_sst_method=xtrabackup
wsrep_sst_auth=username:password

All three nodes have the same config (with wsrep_node_address & wsrep_node_name changed appropriately on each). When I start MariaDB on the first node with either of the following commands, it works fine and shows that the cluster is running with one node up:

#/etc/init.d/mysql bootstrap
#/etc/init.d/mysql start --wsrep-new-cluster

But when I try to start MariaDB on the second node with /etc/init.d/mysql start, it fails to start. When I comment out wsrep_cluster_address=… in the config above, MariaDB on the second node starts, but the node is not part of the cluster. I have used this config before and it worked just fine. But after I shut down the cluster for maintenance and tried to start it again, only the first node will start. Any suggestions as to why this is happening?

Best Answer

You don't provide much information, so I am inferring the cause from what you say and matching it with a common Galera/Percona XtraDB Cluster/MariaDB Cluster misconception and mistake:

"shutdown the cluster for maintenance and tried to start it again, only the first node will start"

When you stop the cluster fully (something that shouldn't happen under normal circumstances), you need to start the cluster in reverse order: the last node to go down must be the first to start. Why? Because if writes were still arriving while the nodes were being stopped, the last node to go down is the one furthest ahead of the others in committed transactions (it has the highest sequence number, seqno, in its saved Galera state). When you shut down a node normally, the other nodes survive and continue receiving updates. If you then start a node by bootstrapping it, you are essentially creating a new cluster from scratch, and Galera cannot know that this node is not the "freshest" one.
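To see how far each node got before it was stopped, you can look at the Galera state file in the data directory. The following is only a minimal sketch: it assumes the state file is grastate.dat under the datadir from the question (/var/lib/mysql), and the function name grastate_seqno is made up for illustration.

```shell
#!/bin/sh
# Print the seqno recorded in a Galera state file.
# Assumption: the default path matches datadir=/var/lib/mysql from the
# question's config; pass another path as the first argument if yours differs.
grastate_seqno() {
    file="${1:-/var/lib/mysql/grastate.dat}"
    # Lines look like "seqno:   1234"; split on the colon plus any spaces.
    awk -F': *' '$1 == "seqno" { print $2 }' "$file"
}

# Usage, run on each stopped node:
#   grastate_seqno
```

A seqno of -1 usually means the node was not shut down cleanly (or is still running), so the file does not hold its real position. In that case, running mysqld with --wsrep-recover should report the recovered position in the error log.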

So in short, always make sure you bootstrap the cluster from the node that was shut down last. If you do not know which node that was, here is a guide on how to find out (along with other useful information on restarting a Galera cluster and other gotchas).
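As a rough illustration of that rule, once you have collected the last committed seqno of every node, the node to bootstrap is simply the one with the highest value. The helper name pick_bootstrap_node and the seqno values in the usage comment are invented for the example; the node names are the ones from the question.

```shell
#!/bin/sh
# Read "node seqno" pairs on stdin and print the node with the highest seqno.
# Hypothetical helper for illustration only; it does not talk to Galera itself.
pick_bootstrap_node() {
    # Sort numerically on the second column, keep the last (largest) line,
    # then print only the node name.
    sort -k2,2n | tail -n 1 | awk '{ print $1 }'
}

# Usage (seqno values are made up):
#   printf 'db1 100\ndb2 250\ndb3 180\n' | pick_bootstrap_node
# Then bootstrap only on the node it prints:
#   /etc/init.d/mysql bootstrap
# and start the remaining nodes normally:
#   /etc/init.d/mysql start
```

If any node reports a seqno of -1, recover its real position first rather than comparing the -1 directly.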

Hope this helps.