Data Warehousing (your app) and OLTP are rather different, so I question the relevance of the benchmark. OLTP, if I am not mistaken, involves very short transactions, which can suffer from the cross-node syncing that Galera's wsrep does at COMMIT. Your insertion of new datapoints, on the other hand, could be done in batches of, say, 1000 rows, and not have to worry about that commit overhead.
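A minimal sketch of that batching idea: build one multi-row INSERT per 1000 datapoints, so the cluster pays Galera's COMMIT-time cost once per batch instead of once per row. The table and column names (sensor_data, ts, sensor_id, value) are hypothetical placeholders, not from your schema:

```python
# Sketch: group rows into multi-row INSERT statements so 1000 datapoints
# commit together, amortizing Galera's per-COMMIT certification overhead.
# Table/column names here are hypothetical placeholders.

def build_batch_inserts(rows, batch_size=1000):
    """Yield multi-row INSERT statements, at most batch_size rows each."""
    for i in range(0, len(rows), batch_size):
        batch = rows[i:i + batch_size]
        values = ",".join("(%r,%r,%r)" % (ts, sid, val) for ts, sid, val in batch)
        yield "INSERT INTO sensor_data (ts, sensor_id, value) VALUES %s;" % values

# 2500 incoming datapoints -> 3 statements (1000 + 1000 + 500 rows)
rows = [(1700000000 + n, 7, 0.5 * n) for n in range(2500)]
stmts = list(build_batch_inserts(rows))
```

In a real client you would send each statement (or use the driver's executemany) inside its own transaction.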
With 60GB of RAM, innodb_buffer_pool_size should be more like 45G. And set innodb_buffer_pool_instances=16.
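As a my.cnf sketch (the 45G value assumes the 60GB box described; tune for your workload):

```ini
[mysqld]
# roughly 70-75% of RAM on a dedicated 60GB server
innodb_buffer_pool_size      = 45G
# split the pool to reduce mutex contention on a large pool
innodb_buffer_pool_instances = 16
```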
On the surface, 19K/sec (per node?) seems faster -- 19K*3 > 54K. But maybe I am misreading things.
Let's see your SHOW CREATE TABLE for the Fact table. There are a number of mistakes that can cause severe performance problems in your type of application.
When do you do the "consolidating"? And how? A Summary table (which I assume is what you are talking about) is key to efficiency. Sometimes it is even better to store the summary data, yet toss the raw data.
Check out my discussions of High speed ingestion and Summary tables. My point in those is that there are significant speedups that can (and should) be made aside from the underlying structure (hardware, Galera, etc.).
Your Galera settings are in the right direction; there may be extra tips in Galera.
More questions (which will lead to more discussion):
- How many clients are pumping data into the table?
- How many rows are inserted at once?
- How easy would it be to gather the rows and batch-insert them?
- Is any Normalization occurring during the insertion?
- Does it make sense to do the summarization in parallel with the insertion?
- Did you intend to have clients writing to all 3 nodes?
- Do you have some form of Proxy between the clients and the nodes?
As of MariaDB 10.1, where Galera is included, you need to start the first node with the --wsrep-new-cluster option,
e.g.
/etc/init.d/mysql start --wsrep-new-cluster
All other nodes start as usual:
/etc/init.d/mysql start
To determine which node needs to be bootstrapped, compare the wsrep_last_committed value on all DB nodes:
KVM-1> SHOW STATUS LIKE 'wsrep_%';
+----------------------+-------------+
| Variable_name        | Value       |
+----------------------+-------------+
| wsrep_last_committed | 21568       |
...
| wsrep_cluster_status | non-Primary |
+----------------------+-------------+
KVM-2> SHOW STATUS LIKE 'wsrep_%';
+----------------------+-------------+
| Variable_name        | Value       |
+----------------------+-------------+
| wsrep_last_committed | 1359        |
...
| wsrep_cluster_status | non-Primary |
+----------------------+-------------+
KVM-3> SHOW STATUS LIKE 'wsrep_%';
+----------------------+-------------+
| Variable_name        | Value       |
+----------------------+-------------+
| wsrep_last_committed | 537         |
...
| wsrep_cluster_status | non-Primary |
+----------------------+-------------+
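The comparison above amounts to picking the node with the highest wsrep_last_committed. A minimal sketch, with the node names and sequence numbers taken from the output above:

```python
# Sketch: pick the node to promote/bootstrap by comparing the
# wsrep_last_committed sequence numbers gathered from each node.
seqnos = {"KVM-1": 21568, "KVM-2": 1359, "KVM-3": 537}

# The node with the highest committed seqno holds the most up-to-date data.
bootstrap_node = max(seqnos, key=seqnos.get)
# -> "KVM-1"
```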
KVM-1 has the most up-to-date data. In this case, all Galera nodes are already started, so you don't necessarily need to bootstrap the cluster again; you just need to promote KVM-1 to be a Primary Component:
KVM-1> SET GLOBAL wsrep_provider_options="pc.bootstrap=1";
The remaining nodes will then reconnect to the Primary Component (KVM-1) and resync their data from this node.
Best Answer
As far as I have read, the right way is to use the init.d/systemd scripts of MySQL/MariaDB to stop a node. This informs WSREP/Galera that replication must stop, and the last transaction number the node has committed is saved to a file on disk in the MySQL/MariaDB data folder, so that on the next boot the nodes can compare their last committed transactions and sync.
You have to stop a node and wait a little (a minute or so) before stopping the next one, to be sure the other nodes have seen it leave.
After that, it is very important to start the nodes again in the reverse order, so that the last node you shut down, the one with the most up-to-date data, is the first to boot. The next nodes will then be able to sync from it.
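The rule above can be sketched as: boot nodes in the reverse of their shutdown order, so the node stopped last (the most up-to-date one) comes up first. The node names here are illustrative:

```python
# Sketch: the node stopped last holds the newest data, so it boots first.
shutdown_order = ["node1", "node2", "node3"]   # order in which nodes were stopped

# Restart in reverse: node3 bootstraps, then node2 and node1 join and sync.
startup_order = list(reversed(shutdown_order))
```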
Using wsrep_on='OFF' doesn't sound right to me. WSREP is embedded in MySQL, so when MySQL stops gracefully, WSREP knows what it has to do.