Given the fact that you mentioned you have Master-Master replication mode
, I would not recommend any automatic failover unless you properly account for Replication Lag. After all, MySQL Replication is asynchronous.
It is theoretically possible to have the following:
DBServer1
as Master to DBServer2
DBServer2
as Master to DBServer1
- DBVIP pointing at DBServer1
DBServer2
is 180 seconds behind
DBServer1
goes down
- Automatic Failover moves DBVIP to
DBServer2
With this scenario, DBServer2
could have auto increment keys that do not exist yet. Upon failover, the DBVIP will allow WebServers to connect to DBServer2
and ask for data that does not exist yet.
This would therefore require background processes running on each DBServer.
For the above scenario:
- DBVIP is on
DBServer1
DBServer1
runs HeartBeat
DBServer2
runs HeartBeat
- Background Process on
DBServer1
to monitor
- a) Data Mount Availability
- b) Data Mount Writeability
- c) MySQL Connectivity
- Once a,b, or c fail, kill HeartBeat
Background Process on DBServer2
to make sure DBVIP is pingable
What should killing HeartBeat do? Trigger the startup script defined for it.
What should the startup script on DBServer2
look for?
- Loop until DBVIP is unreachable via ping
- Connect to MySQL and
- Run
SHOW SLAVE STATUS\G
in a Loop until Seconds_Behind_Master
is NULL
- RUn
SHOW SLAVE STATUS\G
in a Loop until `Exec_Master_Log_Pos stops changing
- Assign DBVIP to
DBServer2
via ip addr add
This is essentailly the algorithm for failing over safely to a Passive Master in a Master/Master Replication Cluster.
ALTERNATIVE
If ALL your data is InnoDB, I recommend something with less rigor. Perhaps you should look into using DRBD and HeartBeat. Here is why:
DRBD provides network RAID-1 for a Block Device on two servers.
You would essentially do this:
- Have
DBServer1's
DRBD Block Device as Primary
- Have
DBServer2's
DRBD Block Device as Secondary
- Mount
DBServer1's
DRBD Device on /var/lib/mysql
- Startup MySQL on
DBServer1
- Have HeartBeat Monitor Ping Activity Between Servers
What would startup script look like in a DRBD scenario?
- Loop until DBVIP is unreachable via ping
- Kill HeartBeart
- Disconnect DRBD
- Promote DRBD to Primary
- Mount DRBD on /var/lib/mysql
- Start MySQL (InnoDB Crash Recovery Fills in Missing Data)
- Assign DBVIP via
ip addr add
This is a lot more straightforward because only one side is Active. The Passive side (DRBD Secondary) is a Synchronous Disk Copy of the Active Side (DRBD Primary).
CAVEAT
If all or most of the working set data is MyISAM, do not touch DRBD. Crash scnearios quickly result in MyISAM tables being marked crashed and need auto-repair (which can be paintfully slow to wait for).
UPDATE 2012-12-29 08:00 EDT
Here are my past posts on using DRBD with MySQL
Within your SQL Server Agent job, have some conditional logic to test for if the current instance is serving the particular role you are looking for on you availability group:
if (select
ars.role_desc
from sys.dm_hadr_availability_replica_states ars
inner join sys.availability_groups ag
on ars.group_id = ag.group_id
where ag.name = 'YourAvailabilityGroupName'
and ars.is_local = 1) = 'PRIMARY'
begin
-- this server is the primary replica, do something here
end
else
begin
-- this server is not the primary replica, (optional) do something here
end
All this does is pull the current role of the local replica, and if it's in the PRIMARY
role, you can do whatever it is that your job needs to do if it is the primary replica. The ELSE
block is optional, but it's to handle possible logic if your local replica isn't primary.
Of course, change 'YourAvailabilityGroupName'
in the above query to your actual availability group name.
Don't confuse availability groups with failover cluster instances. Whether the instance is the primary or secondary replica for a given availability group doesn't affect server-level objects, like SQL Server Agent jobs and so on.
Best Answer
SIOS makes an excellent product that you can use without every supported version of SQL Server and is also SANless. It does block by block transfers and David Klee is a big enough fan that he even does sessions for them, and doesn't get remuneration for it. He just loves the product that much.
I'm not sure if it meets all of your needs but it's definitely worth checking out.
http://us.sios.com/