Given that you mentioned you have Master-Master replication, I would not recommend any automatic failover unless you properly account for Replication Lag. After all, MySQL Replication is asynchronous.
It is theoretically possible to have the following:

- DBServer1 as Master to DBServer2
- DBServer2 as Master to DBServer1
- DBVIP pointing at DBServer1
- DBServer2 is 180 seconds behind DBServer1
- DBServer1 goes down
- Automatic Failover moves DBVIP to DBServer2
With this scenario, DBServer2 could be missing auto-increment keys that already exist on DBServer1. Upon failover, the DBVIP will allow WebServers to connect to DBServer2 and ask for data that does not exist there yet.
This would therefore require background processes running on each DBServer.
For the above scenario:

- DBVIP is on DBServer1
- DBServer1 runs HeartBeat
- DBServer2 runs HeartBeat
- Background process on DBServer1 (a sketch follows this list) to monitor:
  - a) Data Mount Availability
  - b) Data Mount Writability
  - c) MySQL Connectivity
- Once a, b, or c fails, kill HeartBeat
- Background process on DBServer2 to make sure DBVIP is pingable
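Here is a minimal bash sketch of that background process on DBServer1 (paths, service names, and the poll interval are my assumptions):

```bash
#!/bin/bash
# Hypothetical watchdog for DBServer1 -- paths, credentials, and service
# names are assumptions; adjust for your environment.
DATADIR=/var/lib/mysql

while true
do
    # a) Data Mount Availability
    mountpoint -q "$DATADIR" || break
    # b) Data Mount Writability
    ( touch "$DATADIR/.hb_write_test" && rm -f "$DATADIR/.hb_write_test" ) || break
    # c) MySQL Connectivity (assumes credentials in ~/.my.cnf)
    mysqladmin --connect-timeout=5 ping > /dev/null 2>&1 || break
    sleep 5
done

# One of a, b, or c failed: kill HeartBeat so DBServer2 can take over
service heartbeat stop
```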
What should killing HeartBeat do? Trigger the startup script defined for it.
What should the startup script on DBServer2 look for? Roughly the following (a bash sketch comes after the list):
- Loop until DBVIP is unreachable via ping
- Connect to MySQL and:
  - Run `SHOW SLAVE STATUS\G` in a loop until `Seconds_Behind_Master` is NULL
  - Run `SHOW SLAVE STATUS\G` in a loop until `Exec_Master_Log_Pos` stops changing
- Assign DBVIP to DBServer2 via `ip addr add`
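Here is a minimal bash sketch of that startup script (the DBVIP address, netmask, and interface are my assumptions):

```bash
#!/bin/bash
# Hypothetical takeover script for DBServer2 -- the DBVIP, netmask, and
# interface below are assumptions.
DBVIP=10.0.0.100
IFACE=eth0

# Loop until DBVIP is unreachable via ping
while ping -c 1 -W 2 "$DBVIP" > /dev/null 2>&1
do
    sleep 2
done

# Loop until the IO thread gives up on the dead master
# (Seconds_Behind_Master becomes NULL)
until mysql -e 'SHOW SLAVE STATUS\G' | grep -q 'Seconds_Behind_Master: NULL'
do
    sleep 2
done

# Loop until the SQL thread drains the relay logs
# (Exec_Master_Log_Pos stops changing)
POS=-1
while true
do
    NEWPOS=$(mysql -e 'SHOW SLAVE STATUS\G' | awk '/Exec_Master_Log_Pos/ {print $2}')
    [ "$NEWPOS" = "$POS" ] && break
    POS=$NEWPOS
    sleep 2
done

# Assign DBVIP to DBServer2
ip addr add "$DBVIP/24" dev "$IFACE"
```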
This is essentially the algorithm for failing over safely to a Passive Master in a Master/Master Replication Cluster.
ALTERNATIVE
If ALL your data is InnoDB, I recommend something with less rigor. Perhaps you should look into using DRBD and HeartBeat. Here is why:
DRBD provides network RAID-1 for a Block Device on two servers.
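For illustration, a minimal DRBD resource definition might look like this (the resource name r0, hostnames, backing disks, and addresses are all assumptions); protocol C is the synchronous mode:

```
resource r0 {
    protocol C;    # synchronous: a write is not acknowledged until it reaches both nodes
    on dbserver1 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.1:7788;
        meta-disk internal;
    }
    on dbserver2 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.2:7788;
        meta-disk internal;
    }
}
```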
You would essentially do this (commands are sketched after the list):

- Have DBServer1's DRBD Block Device as Primary
- Have DBServer2's DRBD Block Device as Secondary
- Mount DBServer1's DRBD Device on /var/lib/mysql
- Start up MySQL on DBServer1
- Have HeartBeat monitor ping activity between servers
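In commands, the bring-up might look roughly like this (assuming the r0 resource sketched above, with a filesystem already created on it):

```bash
# On both servers: bring the DRBD resource up
drbdadm up r0

# On DBServer1 only: promote, mount, start MySQL
drbdadm primary r0               # a brand-new resource may need --force
mount /dev/drbd0 /var/lib/mysql
service mysql start

# DBServer2 stays Secondary; the device is NOT mounted there
```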
What would the startup script look like in a DRBD scenario? Much like before (a bash sketch follows the list):

- Loop until DBVIP is unreachable via ping
- Kill HeartBeat
- Disconnect DRBD
- Promote DRBD to Primary
- Mount DRBD on /var/lib/mysql
- Start MySQL (InnoDB Crash Recovery fills in missing data)
- Assign DBVIP via `ip addr add`
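Again as a rough bash sketch (same assumed names as before):

```bash
#!/bin/bash
# Hypothetical DRBD takeover script for DBServer2 -- names are assumptions.
DBVIP=10.0.0.100
IFACE=eth0

# Loop until DBVIP is unreachable via ping
while ping -c 1 -W 2 "$DBVIP" > /dev/null 2>&1
do
    sleep 2
done

service heartbeat stop                 # Kill HeartBeat
drbdadm disconnect r0                  # Disconnect DRBD
drbdadm primary r0                     # Promote DRBD to Primary
mount /dev/drbd0 /var/lib/mysql        # Mount DRBD
service mysql start                    # InnoDB Crash Recovery runs on startup
ip addr add "$DBVIP/24" dev "$IFACE"   # Assign DBVIP
```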
This is a lot more straightforward because only one side is Active. The Passive side (DRBD Secondary) is a Synchronous Disk Copy of the Active Side (DRBD Primary).
CAVEAT
If all or most of the working set data is MyISAM, do not touch DRBD. Crash scenarios quickly result in MyISAM tables being marked crashed and needing auto-repair (which can be painfully slow to wait for).
UPDATE 2012-12-29 08:00 EDT
Here are my past posts on using DRBD with MySQL
Best Answer
Jaime's answer is pretty good (and thanks for the credit). However, it deals very specifically with "Table IO" (i.e. latency at the storage engine handler layer) and file IO, per schema.
I'd like to suggest a slightly different answer, which gives exactly what was asked for. This deals with statement latency overall, and counts the number of each type of statement.
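Something along these lines against the statement digest table does it (the statement_type breakdown via SUBSTRING_INDEX is my own slicing; adapt to taste):

```sql
-- Total latency and execution counts per schema and per statement type,
-- from the Performance Schema statement digests (MySQL 5.6+).
SELECT schema_name,
       SUBSTRING_INDEX(digest_text, ' ', 1) AS statement_type,
       SUM(count_star)                      AS exec_count,
       sys.format_time(SUM(sum_timer_wait)) AS total_latency
  FROM performance_schema.events_statements_summary_by_digest
 GROUP BY schema_name, statement_type
 ORDER BY SUM(sum_timer_wait) DESC;
```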
The NULL schema_name is for statements executed without a default database (i.e. no "USE dbname").
You will also need sys installed, but you could skip the sys.format_time(...) and just select the SUM(...), in which case the latency value will be in picoseconds (same as Jaime's).
Since I can't yet comment (my first answer on here!), I'll also note here that Performance Schema table data does not replicate (even DML against the setup_* tables in performance_schema), and that the above query works with the default configuration in 5.6 (performance_schema and the statement instrumentation are both enabled by default there).