You're correct. If the slaves are in sync to the same point on Alpha, START SLAVE UNTIL would serve no purpose.
Effectively, what they're having you do is stop the slave on Gamma so that it stops at the same point in Alpha's binlog as Beta. In a live, high-traffic environment, that can be slightly tricky to time, so I assume that's their motivation for suggesting that approach.
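For reference, this is the command under discussion; the binlog file name and position here are placeholders, not values from your setup:

    -- Run on the slave; the SQL thread stops once it reaches this point.
    START SLAVE UNTIL MASTER_LOG_FILE = 'mysql-bin.000123', MASTER_LOG_POS = 4567;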
If Gamma and Beta have identical Relay_Master_Log_File and Exec_Master_Log_Pos values, then they are at the same point relative to Alpha, and you can safely switch Gamma to Beta at Beta's current master log file and position.
However... to state the (presumably) obvious, you can't be doing any writing on Beta (other than via replication) around the time you make the cut, because if you're writing to Beta, then Gamma would miss any events that hadn't replicated from Beta to Gamma via Alpha... but START SLAVE UNTIL wouldn't help you with that, either.
This means you need to actually verify that all three servers are in agreement in all possible directions, so there's nothing outstanding from Beta back to Alpha. Alpha's Exec_Master_Log_Pos (from SHOW SLAVE STATUS) needs to match Beta's Position (from SHOW MASTER STATUS), and both Gamma and Beta need to match Alpha and each other in the opposite direction (and of course the actual names of the binlogs need to match, not just the positions).
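A sketch of those checks; the server names are from the discussion above, and the named columns are the ones to compare:

    -- On Alpha (a slave of Beta): what Alpha has executed from Beta.
    SHOW SLAVE STATUS\G   -- note Relay_Master_Log_File and Exec_Master_Log_Pos
    -- On Beta: what Beta has written to its own binlog.
    SHOW MASTER STATUS;   -- File and Position must match Alpha's values above

    -- Repeat in the other direction: compare Beta's SHOW SLAVE STATUS
    -- against Alpha's SHOW MASTER STATUS.

    -- On Gamma and Beta (both slaves of Alpha), the two servers'
    -- Relay_Master_Log_File and Exec_Master_Log_Pos must be identical.
    SHOW SLAVE STATUS\G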
If you want to be super safe and paranoid, verify that everything is in agreement, then issue FLUSH TABLES WITH READ LOCK; on both Alpha and Beta (leaving the connections open to hold that global lock, which is yours once the prompt returns), then verify that everything remains in agreement at the same binlog positions, with nothing written to either master's binlog that the other hasn't already played back. Stop, re-home, and restart replication on Gamma, verifying that it is connected and waiting for incoming events. Finally, verify that agreement has persisted in each direction between Alpha and Beta once again, then UNLOCK TABLES; on Alpha and Beta. The whole sequence is sketched below.
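A minimal sketch of that sequence, assuming Gamma's replication account already exists on Beta; the hostname, credentials, and coordinates are placeholders (substitute Beta's actual SHOW MASTER STATUS values):

    -- On Alpha and on Beta, in sessions you keep open:
    FLUSH TABLES WITH READ LOCK;
    -- ... re-verify that all coordinates still agree ...

    -- On Gamma:
    STOP SLAVE;
    CHANGE MASTER TO
        MASTER_HOST = 'beta.example.com',      -- placeholder hostname
        MASTER_USER = 'repl',                  -- assumed replication account
        MASTER_PASSWORD = 'repl_password',
        MASTER_LOG_FILE = 'mysql-bin.000045',  -- Beta's File
        MASTER_LOG_POS = 107;                  -- Beta's Position
    START SLAVE;
    SHOW SLAVE STATUS\G  -- Slave_IO_Running and Slave_SQL_Running should both say Yes

    -- Back on Alpha and Beta, after a final agreement check:
    UNLOCK TABLES;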
Note that above, I never stopped the slave on Beta. It's not really necessary if you have a global read lock, since the slave is stalled and binlog coordinates can't advance under that condition, but you could stop it if desired, so long as the coordinates continue to match Alpha.
You should put everything on a level playing field. How? Without proper tuning, it is possible for older versions of MySQL to outrun and outgun newer versions.
Before running SysBench on the three environments:
- Make sure all InnoDB settings are identical for all DB Servers (a quick check is sketched after this list)
- For the Master/Slave, run STOP SLAVE; on the Slave
- For PXC (Percona XtraDB Cluster), shut down two of the Masters
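For the first item, one hedged approach (the hostnames are placeholders) is to dump each server's InnoDB settings and diff them:

    # Any line in the diff output is a setting to reconcile before benchmarking.
    mysql -h server1 -e "SHOW GLOBAL VARIABLES LIKE 'innodb%'" > server1-innodb.txt
    mysql -h server2 -e "SHOW GLOBAL VARIABLES LIKE 'innodb%'" > server2-innodb.txt
    diff server1-innodb.txt server2-innodb.txt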
Compare the speeds of just the standalone MySQL, Percona Server, and MariaDB instances.
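As a hedged example of what such a run could look like with the older sysbench 0.4-style OLTP test (the host, credentials, table size, and thread count are all placeholders to adjust for your environment):

    # Build the test table, then run the identical workload against each server.
    sysbench --test=oltp --mysql-host=db-host --mysql-user=sbtest \
             --mysql-password=sbtest --oltp-table-size=1000000 prepare
    sysbench --test=oltp --mysql-host=db-host --mysql-user=sbtest \
             --mysql-password=sbtest --oltp-table-size=1000000 \
             --num-threads=16 --max-time=300 --max-requests=0 run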
ANALYSIS
If MySQL is best (Percona people, please don't throw rotten vegetables at me just yet; this is just conjecture), run START SLAVE; again. Run SysBench against the Master/Slave pair. If the performance is significantly slower, you may have to implement semisynchronous replication.
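Should it come to that, a minimal sketch of enabling MySQL's semisynchronous plugins (available in 5.5 and later; the timeout value is just an example):

    -- On the Master:
    INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
    SET GLOBAL rpl_semi_sync_master_enabled = 1;
    SET GLOBAL rpl_semi_sync_master_timeout = 1000;  -- ms before falling back to async

    -- On the Slave:
    INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
    SET GLOBAL rpl_semi_sync_slave_enabled = 1;
    STOP SLAVE IO_THREAD;
    START SLAVE IO_THREAD;  -- restart the IO thread so the setting takes effect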
If PXC is best, bring the other two nodes back up and rerun SysBench. If the full cluster is significantly slower, you may need to tune the wsrep settings or the network itself.
If MariaDB is best, you could switch to MariaDB Galera Cluster (if you have the money) or set up Master/Slave replication with MariaDB. Run SysBench again. If the performance is significantly slower, you may need to tune the wsrep settings or the network itself.
Why tune wsrep settings? Keep in mind that Galera wsrep (WriteSet Replication) uses virtually synchronous commits and rollbacks. In other words, either all nodes commit or all nodes roll back. Under that arrangement, the weakest link would be (a tuning sketch follows this list):
- how fast the communication between nodes happens (especially true if the nodes are in different data centers)
- whether any one node has underconfigured hardware
- whether any one node communicates more slowly than the other nodes
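A couple of hedged examples of the kind of knobs involved; the values are illustrations, not recommendations, and should be tuned against your own workload:

    -- Apply incoming writesets with more parallel threads:
    SET GLOBAL wsrep_slave_threads = 8;
    -- Raise the flow-control limit so a briefly slow node doesn't stall the cluster:
    SET GLOBAL wsrep_provider_options = 'gcs.fc_limit=256';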
Side Note: You should also make sure to tune MySQL for multiple CPUs.
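For example (a my.cnf sketch; the thread counts are assumptions to adjust for your core count, and the I/O thread settings require a restart):

    [mysqld]
    innodb_thread_concurrency = 0   # 0 = let InnoDB manage concurrency itself
    innodb_read_io_threads    = 8   # default is 4
    innodb_write_io_threads   = 8   # default is 4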
UPDATE 2014-11-04 21:06 EST
Please keep in mind that Percona XtraDB Cluster does not scale writes very well to begin with. Note what the documentation says under its drawbacks (second drawback):
This can’t be used as an effective write scaling solution. There might be some improvements in write throughput when you run write traffic to 2 nodes vs all traffic to 1 node, but you can’t expect a lot. All writes still have to go on all nodes.
SUGGESTION #1
For PXC, turn off one node and run SysBench against the remaining two-node cluster. If the write performance is better than with the three-node cluster, then it is clear that the communication between the nodes is the bottleneck.
SUGGESTION #2
I noticed you have a 42GB Buffer Pool, which is more than half the server's RAM. You need to partition the buffer pool by setting innodb_buffer_pool_instances to 2 or more. Otherwise, you can expect some swapping.
SUGGESTION #3
Your innodb_log_buffer_size is 8M by default. Try making it 256M to increase log write performance.
SUGGESTION #4
Your innodb_log_file_size is 512M. Try making it 2G to increase log write performance. If you apply this setting, then set innodb_log_buffer_size to 512M.
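Pulling Suggestions #2 through #4 together as a my.cnf sketch (use 256M for the log buffer if you keep the 512M log files):

    [mysqld]
    innodb_buffer_pool_instances = 2     # Suggestion #2: partition the 42G buffer pool
    innodb_log_file_size         = 2G    # Suggestion #4
    innodb_log_buffer_size       = 512M  # pairs with the 2G log files per Suggestion #4

One caveat: on MySQL releases older than 5.6.8, changing innodb_log_file_size requires a clean shutdown and removal of the old ib_logfile* files before restarting; newer releases resize the redo logs automatically at startup.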
Best Answer
I would start with only two servers in an active/passive Master-Master setup and a load balancer in front. I don't see any benefit to the cluster, and I don't want a third node, or the active master waiting on any other nodes.
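A minimal sketch of such a pair, assuming hypothetical hostnames db1/db2 and an existing replication account; the staggered auto-increment settings guard against key collisions if the passive node ever takes writes:

    # my.cnf on db1
    [mysqld]
    server_id                = 1
    log_bin                  = mysql-bin
    auto_increment_increment = 2
    auto_increment_offset    = 1

    # my.cnf on db2: same, but server_id = 2 and auto_increment_offset = 2

    -- Then on each server, point replication at the other; e.g. on db1
    -- (file/position are placeholders from db2's SHOW MASTER STATUS):
    CHANGE MASTER TO MASTER_HOST = 'db2', MASTER_USER = 'repl',
        MASTER_PASSWORD = 'repl_password',
        MASTER_LOG_FILE = 'mysql-bin.000001', MASTER_LOG_POS = 4;
    START SLAVE;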