MariaDB – The right way of starting up a Galera Cluster

clustering galera mariadb

I am confused about the bootstrap process of MariaDB Galera Cluster.

Let's say we have a 3-node MariaDB Galera Cluster running normally on CentOS 6/7. At some point (the reason doesn't matter) we have to stop all 3 nodes and then start the cluster up again. We stop the nodes one by one, normally and cleanly. To start the cluster, I check grastate.dat on each node to see which one has the most current data, and I bootstrap that node first.

What is unclear to me is exactly how the bootstrap is done on CentOS 6 and 7. The 2nd and 3rd nodes can be started with systemctl start mysql. How do we start the first node? Do we use service mysql start --bootstrap on the first node, the one with the most current data? I read that on CentOS 7 it's systemctl start mysql@bootstrap.service. What is the procedure for bootstrapping the first node of the cluster after all 3 nodes have been stopped normally?

PS: Related to this, with some more details, is the post by dr01:
grastate.dat with seqno -1 on a healthy cluster. Why?

Best Answer

In MariaDB 10.1, where Galera is included, you need to start the first node with the --wsrep-new-cluster option:

e.g. /etc/init.d/mysql start --wsrep-new-cluster

All other nodes are started as usual:

/etc/init.d/mysql start
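
In the scenario from the question, where every node has been shut down cleanly, the server is not running, so the position saved in grastate.dat is what you can compare between nodes. Below is a minimal sketch for reading it. The default path /var/lib/mysql/grastate.dat and both function names are assumptions for illustration, and the safe_to_bootstrap field is only written by newer Galera versions (3.19 and later, as far as I know).

```shell
#!/bin/sh
# Sketch only: helper names and the default datadir path are assumptions.

# Print the seqno stored in a grastate.dat file.
galera_seqno() {
    awk -F': *' '$1 == "seqno" { print $2 }' "${1:-/var/lib/mysql/grastate.dat}"
}

# Print the safe_to_bootstrap flag, if the file has one (newer Galera only).
galera_safe_to_bootstrap() {
    awk -F': *' '$1 == "safe_to_bootstrap" { print $2 }' "${1:-/var/lib/mysql/grastate.dat}"
}

# Usage on each stopped node:
#   galera_seqno
#   galera_safe_to_bootstrap
#
# Then, only on the node with the highest seqno:
#   /etc/init.d/mysql start --wsrep-new-cluster    # SysV init (CentOS 6)
#   galera_new_cluster                             # systemd wrapper shipped with MariaDB 10.1+ (CentOS 7)
```

A seqno of -1 means the node did not record a position at shutdown, so the value cannot be used for comparison on that node.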

To determine which node needs to be bootstrapped, compare the wsrep_last_committed value on all DB nodes:

KVM-1> SHOW STATUS LIKE 'wsrep_%';
+----------------------+-------------+
| Variable_name        | Value       |
+----------------------+-------------+
| wsrep_last_committed | 21568       |
...
| wsrep_cluster_status | non-Primary |
+----------------------+-------------+

KVM-2> SHOW STATUS LIKE 'wsrep_%';
+----------------------+-------------+
| Variable_name        | Value       |
+----------------------+-------------+
| wsrep_last_committed |  1359       |
...
| wsrep_cluster_status | non-Primary |
+----------------------+-------------+



KVM-3> SHOW STATUS LIKE 'wsrep_%';
+----------------------+-------------+
| Variable_name        | Value       |
+----------------------+-------------+
| wsrep_last_committed |   537       |
...
| wsrep_cluster_status | non-Primary |
+----------------------+-------------+
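
With more than a handful of nodes, the comparison above can be scripted. This is a rough sketch that assumes you have already collected one "node value" pair per line (for example by running the SHOW STATUS query on each node). The function name is made up for illustration.

```shell
#!/bin/sh
# Sketch only: most_advanced_node is a hypothetical helper name.
# Reads lines of the form "<node-name> <wsrep_last_committed>" on stdin
# and prints the name of the node with the highest value.
most_advanced_node() {
    sort -k2,2n | tail -n 1 | awk '{ print $1 }'
}

# Usage with the values from the status output above:
#   printf 'KVM-1 21568\nKVM-2 1359\nKVM-3 537\n' | most_advanced_node
#   prints: KVM-1
```

If two nodes report the same value, the sketch simply prints one of them, which is fine since either holds the same committed position.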

KVM-1 has the most up-to-date data. In this case all Galera nodes are already running, so you don't necessarily need to bootstrap the cluster again. You just need to promote KVM-1 to be the Primary Component:

KVM-1> SET GLOBAL wsrep_provider_options="pc.bootstrap=1";

The remaining nodes will then reconnect to the Primary Component (KVM-1) and resync their data from this node.
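
To confirm that the cluster has re-formed, you can check wsrep_cluster_status and wsrep_cluster_size on any node. Here is a small sketch that extracts a single value from the tab-separated output of the mysql client. The helper name wsrep_value is an assumption, and you may need to add credentials to the mysql call.

```shell
#!/bin/sh
# Sketch only: wsrep_value is a hypothetical helper name.
# Reads "name<TAB>value" lines (mysql -N -B output) on stdin and
# prints the value of the requested status variable.
wsrep_value() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# Usage on any node (-N skips column names, -B gives tab-separated output):
#   mysql -N -B -e "SHOW STATUS LIKE 'wsrep_cluster_%'" | wsrep_value wsrep_cluster_status
#   mysql -N -B -e "SHOW STATUS LIKE 'wsrep_cluster_%'" | wsrep_value wsrep_cluster_size
```

On a healthy 3-node cluster you would expect the status to read Primary and the size to read 3 on every node.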