For the sake of simplicity, I recommend MySQL Circular Replication only. Here is why:
There are many technologies and topologies that are far superior to MySQL Circular Replication. My favorite, hands down, is DRBD (Distributed Replicated Block Device). However, DRBD works great when the Server Pair is in the same building, data center, and rack. It's even better when using a crossover cable on a 192.168.x.x subnet between the DRBD Primary and DRBD Secondary. Unfortunately, DRBD has horrible performance over a distance between two locations, although it can still work. No network topology available today will give you the satisfactory DRBD performance needed between two data centers.
Once you set up MySQL Circular Replication between the two DB servers in the two data centers, the only tuning needed is for the network. In essence, replication performance is a function of network settings (speed/latency of binary log transmission in a MySQL Replication setup) and disk I/O (DRBD).
As an example, here is an alternative you may want for better redundancy:
Set up a DRBD Pair in both locations
DRBD Pair in Site #1 with VIP 111.111.111.111
DRBD Pair in Site #2 with VIP 222.222.222.222
Set up MySQL Circular Replication between the DRBD Primary Servers under these conditions:
For Site #1, use 222.222.222.222 as the Master_Host in MySQL
For Site #2, use 111.111.111.111 as the Master_Host in MySQL
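Concretely, the replication setup on each DRBD Primary might look like the following configuration sketch (the replication user, password, and binary log coordinates are placeholders; get the real coordinates from SHOW MASTER STATUS on the other site):

```shell
# Run on the Site #1 DRBD Primary; Site #2 mirrors this with
# MASTER_HOST='111.111.111.111'. User, password, and log coordinates
# below are placeholders, not values from your environment.
mysql -u root -p <<'EOF'
CHANGE MASTER TO
    MASTER_HOST='222.222.222.222',
    MASTER_USER='repl',
    MASTER_PASSWORD='repl_password',
    MASTER_LOG_FILE='mysql-bin.000001',
    MASTER_LOG_POS=4;
START SLAVE;
EOF
```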
Although this introduces a level of complexity, you now have two levels of redundancy: DRBD within each site and MySQL Circular Replication between sites. You have the additional benefit of running backups via mysqldump against the DRBD Primary at the hot standby site.
As for failover, DRBD provides automatic failover at any one site.
Only in the event that a data center is totally unavailable would you use the DB VIP at the hot standby site.
UPDATE
I just did a double take and noticed that you are using Drupal 6. I am glad you will be converting all the Drupal tables to InnoDB. This will remove any chance of MyISAM table updates causing table locks that freeze DB connections that are simply reading MyISAM tables. Any DML statement (INSERT, UPDATE, DELETE) against a MyISAM table WILL ALWAYS TAKE A FULL TABLE LOCK!!! Using InnoDB introduces row-level locking, which eliminates full table locks.
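One low-risk way to do that conversion is to generate the ALTER TABLE statements instead of typing them one by one. A minimal sketch, assuming a schema named drupal and a hypothetical table list (in practice you would pull the list from information_schema.tables WHERE engine='MyISAM'):

```shell
# Hypothetical table list; in real life, query information_schema
# for every MyISAM table in the Drupal schema.
TABLES="node users comments"

SQL=""
for t in $TABLES; do
    # One conversion statement per table; pipe the result into mysql.
    SQL="$SQL
ALTER TABLE drupal.$t ENGINE=InnoDB;"
done
echo "$SQL"
```

Feeding the generated statements back into the mysql client converts every listed table in one pass.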
In addition, DRBD becomes your friend when everything is InnoDB because crash recovery will be consistent between the DRBD Pair. Conversely, DRBD with MyISAM buys you nothing because a crashed MyISAM table on the DRBD Primary is simply duplicated to the DRBD Secondary as, you guessed it, a crashed MyISAM table.
UPDATE #2
You should use two levels of redundancy:
Level 1 : At each database center, use DRBD.
http://dev.mysql.com/doc/refman/5.1/en/ha-drbd.html
Set up a pair of DB Servers
Startup DRBD
Startup MySQL on the DRBD Primary
This creates redundant data at the disk level.
Level 2 : You should set up MySQL Circular Replication between
the DRBD Primary of DataCenter#1 and the DRBD Primary of DataCenter#2
Each DRBD Primary will be running MySQL and will act
as both Master and Slave to Each Other
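The Level 1 DRBD bring-up can be sketched as an ops fragment like this (resource name r0, device, and mount point are placeholders from a hypothetical drbd.conf; the --force promotion is only for first-time initialization of an empty resource):

```shell
# On both nodes of the pair:
drbdadm create-md r0      # initialize DRBD metadata for resource r0
drbdadm up r0             # attach the device and connect to the peer

# On the node chosen as DRBD Primary only:
drbdadm primary --force r0       # first-time promotion of an empty resource
mkfs.ext4 /dev/drbd0             # first time only: create the filesystem
mount /dev/drbd0 /var/lib/mysql  # MySQL's datadir now lives on DRBD
service mysql start              # MySQL runs on the Primary only
```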
I have set up topologies like this for clients, and I consider it quite stable.
In our shop we selected repmgr and pgbouncer instead of pgpool. repmgr has some nice tooling to set up and maintain the cluster of replicated database servers. In our case: one master and two slaves (one for failover and one for live read performance testing, which can become the failover candidate for the new master). pgpool has issues with configuration changes: in most cases you have to restart the service, and therefore you have some downtime. That is a problem when you need 24x7x365 availability.
repmgrd (the daemon) helps to select the new master after a failover; you really don't want a split-brain situation. We have one virtual IP address for the master database, i.e., whichever database is master at that moment. When another server becomes master, it is the only server using this address. Every database server also has its own IP address for read-only queries.
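The promotion-plus-VIP handover described above can be sketched as an ops fragment like this (the repmgr.conf path, VIP address, and interface are placeholders; repmgr standby promote is a real repmgr subcommand, the rest is plain iproute2/iputils):

```shell
# On the standby chosen to become the new master:
repmgr -f /etc/repmgr.conf standby promote

# Move the master VIP to this node (address and interface are placeholders):
ip addr add 10.0.0.50/24 dev eth0
arping -c 3 -A -I eth0 10.0.0.50   # gratuitous ARP so clients learn the new home
```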
repmgr is maintained by the same people who created streaming replication in the first place, so they know what they're talking about. Version 2.0 is about to be released.
Prepare for the worst situation: do some serious testing by pulling power and network plugs! When something goes wrong, many other things have usually already gone wrong, and they will bite you in the back when you can least afford it.
Replication is one thing; a working failover after some serious problems is another.
Best Answer
Given the fact that you mentioned you have Master-Master replication mode, I would not recommend any automatic failover unless you properly account for Replication Lag. After all, MySQL Replication is asynchronous. It is theoretically possible to have the following:
DBServer1 as Master to DBServer2
DBServer2 as Master to DBServer1
DBServer2 is 180 seconds behind DBServer1
DBServer1 goes down

With this scenario, DBServer1 could have auto increment keys that do not yet exist on DBServer2. Upon failover, the DBVIP will allow WebServers to connect to DBServer2 and ask for data that does not exist yet. This would therefore require background processes running on each DBServer.
For the above scenario:

DBServer1 runs HeartBeat
DBServer2 runs HeartBeat
Background Process on DBServer2 to monitor DBServer1, to make sure the DBVIP is pingable

What should killing HeartBeat do? Trigger the startup script defined for it.
What should the startup script on DBServer2 look for?

SHOW SLAVE STATUS\G in a loop until Seconds_Behind_Master is NULL
SHOW SLAVE STATUS\G in a loop until Exec_Master_Log_Pos stops changing
Bring up the DBVIP on DBServer2 via ip addr add
This is essentially the algorithm for failing over safely to a Passive Master in a Master/Master Replication Cluster.
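The "wait until the position stops changing" step can be sketched in shell. Since SHOW SLAVE STATUS needs a live server, the position source here is mocked with a counter so the sketch is self-contained and runnable; in a real script, pos would come from parsing Exec_Master_Log_Pos out of SHOW SLAVE STATUS\G:

```shell
#!/bin/sh
# Poll a replication position until it stops advancing (relay log drained);
# only then is it safe to bring up the DBVIP on the surviving master.
wait_until_caught_up() {
    prev=""
    while :; do
        pos=$("$@")     # real life: parse Exec_Master_Log_Pos from SHOW SLAVE STATUS\G
        if [ "$pos" = "$prev" ]; then
            break       # position stopped changing between polls
        fi
        prev=$pos
        # sleep 5       # a real script would pause between polls
    done
    echo "$pos"
}

# Mock position source: advances twice, then holds steady.
STATE=$(mktemp)
echo 100 > "$STATE"
mock_pos() {
    n=$(cat "$STATE")
    if [ "$n" -lt 102 ]; then
        echo $((n + 1)) > "$STATE"
    fi
    cat "$STATE"
}

FINAL=$(wait_until_caught_up mock_pos)
echo "caught up at Exec_Master_Log_Pos=$FINAL"
# next step in a real failover: ip addr add <DBVIP> dev eth0
```

Passing the position command as arguments to the function keeps the polling logic separate from the (server-dependent) status query, which makes the loop easy to test.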
ALTERNATIVE
If ALL your data is InnoDB, I recommend something with less rigor. Perhaps you should look into using DRBD and HeartBeat. Here is why:
DRBD provides network RAID-1 for a Block Device on two servers.
You would essentially do this:

DBServer1's DRBD Block Device as Primary
DBServer2's DRBD Block Device as Secondary
Mount DBServer1's DRBD Device on /var/lib/mysql
Run mysqld on DBServer1 only

What would the startup script look like in a DRBD scenario? Promote the DRBD Secondary to Primary, mount the DRBD device on /var/lib/mysql, start mysqld, and bring up the DBVIP via ip addr add.
This is a lot more straightforward because only one side is Active. The Passive side (DRBD Secondary) is a Synchronous Disk Copy of the Active Side (DRBD Primary).
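A hedged sketch of what that failover script on DBServer2 might run, as an ops fragment (resource name r0, device, and DBVIP address are placeholders):

```shell
# Executed on DBServer2 when DBServer1 is declared dead:
drbdadm primary r0                   # promote the former DRBD Secondary
mount /dev/drbd0 /var/lib/mysql      # filesystem arrives crash-consistent
service mysql start                  # InnoDB runs its own crash recovery
ip addr add 10.0.0.100/24 dev eth0   # take over the DBVIP (placeholder address)
```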
CAVEAT
If all or most of the working set data is MyISAM, do not touch DRBD. Crash scenarios quickly result in MyISAM tables being marked crashed and needing auto-repair (which can be painfully slow to wait for).
UPDATE 2012-12-29 08:00 EDT
Here are my past posts on using DRBD with MySQL:

Mar 29, 2011 : MySQL high availability, failover and replication with Latency
Aug 29, 2011 : MySQL Replication : 1 Slave / Multiple Masters
Dec 19, 2011 : Best way to setup master to multi master replication
Jul 25, 2012 : Mysql database replication on different vlan/subnet/another site (I stirred a pretty big pot on this one in someone else's blog)