I am running Percona MySQL 5.5.39 on two masters with keepalived monitoring a VIP which will failover to the standby master if the primary master goes dark. While testing the fault tolerance of my setup in the lab I noticed that I am getting duplicate key entries on both masters.
Basically I run siege to emulate user load, for a duration of five minutes, within that time duration I shutdown the primary master, and after about 10 seconds for keepalived to detect the outtage, the VIP is switched the standby master, at which time everything is working as expected. Then after a minute or so, while siege is still running, I power on the primary master, and when online, it assumes the VIP.
It is during the swapping of the VIP that I think the duplicate entry issue is occurring. I followed this post explaining how to implement 'crash resilient' replication, but after implementing the suggested my.cnf changes and rerunning the user simulation, I ended up with the same broken replication.
There is also the ability for the standby master to increment it's indices by 2 instead of 1 to avoid collisions with the primary master, but I feel this is more of a hack than a solution.
Is there any better way to avoid primary key collisions in an HA environment?
Best Answer
This is not a hack. It is the intention of them: Use
auto_increment_increment
andauto_increment_offset
http://dev.mysql.com/doc/refman/5.6/en/replication-options-master.html#sysvar_auto_increment_increment