MySQL – Level of Resilience Possible with MySQL Replication

mysql replication

I've found lots of good material on setting up MySQL replication, but not much on what to do in the event of failures. Understanding the degree of resilience I can achieve is fundamental to designing my system, so I'm not here looking for "howto" administrative advice; rather, I want to understand the achievable semantics.

So, to simplify my requirements, I have an application that cannot function unless it can read a database. There are two scenarios: first, the common usage, many requests a minute that read the database and return an answer. Second, less frequently, updates to the database with new data. It's acceptable for the updates to be a few minutes delayed.

So my first thought: Master->Slave

Now the reader can use either the Master or the Slave; if we lose the Master for a while, the reader can work against the Slave.
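For concreteness, pointing a slave at a master comes down to giving it the master's binary log coordinates. A minimal sketch (hostname, credentials, and log coordinates below are illustrative placeholders, not values from this post):

```sql
-- Run on the Slave. All values here are placeholders.
CHANGE MASTER TO
    MASTER_HOST     = 'master.example.com',
    MASTER_USER     = 'repl',
    MASTER_PASSWORD = 'repl_password',
    MASTER_LOG_FILE = 'mysql-bin.000001',  -- from SHOW MASTER STATUS on the Master
    MASTER_LOG_POS  = 4;
START SLAVE;

-- Confirm both replication threads are running:
SHOW SLAVE STATUS\G
```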

Sounds simple enough. But … what about more drastic problems, how manual is recovery? How long does it take? What data is lost?

Take this scenario: Master->Slave. We know that the slave is potentially a little out of sync with the Master. Now suppose we lose the Master in a way that means it won't be back any time soon.

Now presumably we need the Slave to become the writable Master, and we'll need a new Slave.

Specific Questions:

  1. How much time and effort does it take to make the Slave the Master? I failed to find docs on what to do. I'm guessing that this is pretty easy. Can we make this take-over seamless to client apps? Adjust DNS routing or some such?
  2. If we can no longer get at the old Master's binary logs, then we have to accept that some updates committed on the old Master will never make it to the new Master; in other words, we do have data loss?
  3. How much effort to create the new Slave? My guess here is that this is not difficult but potentially takes time. I was trying to imagine reducing this overhead by having two Slaves, and adjusting the replication so that when Slave 1 becomes the new Master, Slave 2 becomes the slave of that new Master. However, given the potential delays in replication, I don't see that ensuring complete consistency is very easy.
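On question 1, the manual promotion steps are roughly as follows. This is only a sketch, assuming the slave has already applied everything in its relay log; it is not a tested failover procedure:

```sql
-- On the Slave being promoted. First let it finish applying its relay log,
-- then stop replication and make it writable.
STOP SLAVE;
RESET SLAVE;                 -- discard the old Master's coordinates
SET GLOBAL read_only = OFF;  -- only meaningful if read_only was set

-- Client apps must then be repointed at this server, e.g. via DNS or a
-- virtual IP; MySQL itself does not redirect connections.
```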

Best Answer

I use master slave a lot for speeding up reads at remote offices at the end of slow (512 kbit) connections. My implementation and experience is as follows:

  1. Applications are written to read from the slave and write to the master. Reads inside a transaction, say for a last_insert_id(), need to be done from the master.

  2. In the event of an outage (say the broadband link goes down) reads continue from the slave transparently, but writes are not possible for that remote location. Other writes continue from other locations. The slaves keep updating as normal.

  3. When the link is restored, the slave reconnects and downloads any updates and synchronises itself, usually transparently.
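Whether the slave really has resynchronised after an outage can be checked from the replication status output. A sketch of the fields to watch (standard SHOW SLAVE STATUS output):

```sql
-- Run on the Slave after the link is restored.
SHOW SLAVE STATUS\G
-- Key fields:
--   Slave_IO_Running:  Yes   (downloading the Master's binary log)
--   Slave_SQL_Running: Yes   (applying the relay log)
--   Seconds_Behind_Master: 0 (caught up; NULL means replication has stopped)
```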

This has been pretty successful for me where I have lots of people reading and only a few updating, most of whom are at head office.

You can have multiple slaves updating from the master and, if I remember correctly, a slave can also update from another slave so you can have "layers" of slaves.
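For the "layers" arrangement, the intermediate slave has to write the statements it replicates into its own binary log so that downstream slaves can pick them up. A minimal my.cnf sketch for that relay box (the server-id value is an arbitrary example; it just has to be unique in the topology):

```ini
# my.cnf on the intermediate slave (Master -> this box -> downstream slaves)
[mysqld]
server-id         = 2          # unique per server in the replication topology
log-bin           = mysql-bin  # needed so downstream slaves can replicate from it
log-slave-updates              # also log changes received from the Master
```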

The following is a useful read. http://dev.mysql.com/doc/refman/5.5/en/replication-solutions-switch.html