I recently set up a MariaDB galera cluster for our production. I used sysbench to benchmark the cluster against the old database which is on a single server.
On my PRD Galera Cluster I got the following results:
SQL statistics:
queries performed:
read: 3914980
write: 0
other: 782996
total: 4697976
transactions: 391498 (1304.77 per sec.)
queries: 4697976 (15657.22 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 300.0492s
total number of events: 391498
Latency (ms):
min: 5.37
avg: 12.26
max: 66.20
95th percentile: 15.83
sum: 4798745.23
Threads fairness:
events (avg/stddev): 24468.6250/414.77
execution time (avg/stddev): 299.9216/0.01
Meanwhile our old single database production got this results:
SQL statistics:
queries performed:
read: 5306060
write: 0
other: 1061212
total: 6367272
transactions: 530606 (1768.51 per sec.)
queries: 6367272 (21222.18 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 300.0266s
total number of events: 530606
Latency (ms):
min: 3.87
avg: 9.04
max: 59.99
95th percentile: 12.08
sum: 4798278.00
Threads fairness:
events (avg/stddev): 33162.8750/440.14
execution time (avg/stddev): 299.8924/0.01
Now I'm wondering why does the cluster operate a bit slower than the single database? They have the same specs: Quadcore CPU, 32GB RAM and vm.swappiness=1. Here's my cluster configuration (same across 3 servers) and is using HAProxy to load balance between 3 servers:
max_connections = 3000
wsrep_slave_threads=4
innodb_lock_wait_timeout=8000
innodb_io_capacity=2000
innodb_buffer_pool_size=25G
innodb_buffer_pool_instances=25
innodb_log_buffer_size=256M
innodb_log_file_size=1G
innodb_flush_log_at_trx_commit=2
innodb_flush_method = O_DIRECT_NO_FSYNC
innodb_read_io_threads=8
innodb_write_io_threads=4
thread_handling = pool-of-threads
thread_stack = 192K
thread_cache_size = 4
thread_pool_size = 8
thread_pool_oversubscribe = 3
wsrep_provider_options="gcache.size=10G; gcache.page_size=10G"
I used sysbench on a spare server, does the latency between servers also affect the outputs? I would appreciate any inputs, thank you.
Best Answer
Galera being slower for that benchmark is not surprising. But is it useful info?
At the end of a transaction, in the
COMMIT
, the node being written communicates with each other node, asking "will you guarantee that this transaction will succeed?" (This is slightly weaker, but faster, than actually waiting for the writes to finish on the other nodes.)What's in the 12 queries that make up a "transaction"? Does that reflect your usage?
If I munch the numbers correctly, the stand-alone test has 200 transactions running at any moment; 100 for Galera. Is that realistic? I hope not -- your quadcore machine will begin to stumble over itself after about 4 transactions. I would expect it to hit a ceiling long before 200/100 -- This ceiling often characterizes itself by transaction-per-second improving no further, while latency goes through the roof.
Are you hitting all nodes, or just one?
Does a "transaction" have
SELECTs
in support of the transaction. I ask this because that is a situation where running transactions on multiple nodes can actually increase throughput. (On a solo machine, all the selects would have to be done on the Primary.)Here's an analogy that makes me question benchmarks. Benchmarks are like testing a hot car -- Drive it for an hour with the accelerator on the floor. Not realistic.
If you need anywhere near 1768 TPS, neither topology will survive long.