Mysql – How to make MySQL replication reliable

innodbmyisamMySQLreplication

  • Master version: 5.5.13-1
  • Slave version: 5.5.14-1
  • Binary log format: MIXED

My Slave database (~ 40GB) has been out of sync from the Master. I cannot find anything interesting in the error log. Google gives me a very helpful link.

I'm going to re-sync the database follow this instruction to minimal downtime on the Master. But before doing this, I just want to make sure that this situation is limited in the future. I will scan through the parts in the above to show you what I've done:

  • Slave database was configured with read-only option
  • There is some unsafe queries. Does it get some problems with MIXED
    based replication?
  • I replicated all databases
  • I used both InnoDB and MyISAM storage engines
  • Developers use alot of temporary tables

Should I:

  • Don't use the unsafe queries
  • Ask developers put all the temporary tables into a separated database

Is there anything else? In case of out of sync, is mk-table-sync reliable enough to re-sync automatically? Does anyone use it on production?


UPDATE: Tue Feb 28 23:27:13 ICT 2012

My Slave database (~ 40GB) has been out of sync from the Master. I
cannot find anything interesting in the error log.

To get more information about what was happening, the Slave should be started with --log-warnings=2.

Best Answer

OBSERVATION #1

You mentioned Ask developers put all the temporary tables into a separated database

If your developers are using CREATE TEMPORARY TABLE commands to create temporary tables, they need to use CREATE TABLE instead. Here is why:

With MySQL Replication processing a temporary table, this is what occurs

  • 1) Master run CREATE TEMPORARY TABLE
  • 2) Command inserted into binary log
  • 3) Replication copies this over to the Slave's Relay Logs via I/O Thread
  • 4) Slave SQL Thread runs CREATE TEMPORARY TABLE
  • 5) Slave processes data with that temp table

Once in a while, someone may run STOP SLAVE; to run a backup. If STOP SLAVE; is issued just after step 4, the temp created disappears and so does its data. When you run START SLAVE; Replication breaks instantly complaining the table does not exist. This is normal because when a DB Connections terminates deliberately or accidently, all temp tables opened using CREATE TEMPORARY TABLE in the DB session are dropped. Running STOP SLAVE; kill the SQL thread who was holding opening the temp table.

The only workaround for this is to create the table using CREATE TABLE instead of CREATE TEMPORARY TABLE. When run STOP SLAVE;, the temp table you created normally does not disappear.

I have seen this happen maybe 10 times in my DBA career. Fixing it using the binary logs to find out the name of the temp tables, to create those tables using CREATE TABLE, then starting replication up was the only maintenance possible without just brute force copying the master.

OBSERVATION #2

mk-table-sync only works on tables with primary keys and/or unique keys. It works maybe 99% of the time. I have seen instances where the checksum of a table on the master and slave were different. I would run mk-table-sync, there were still differences (Of course, I was doing mk-table-sync in circular replication with 3 masters, which can be a little dangerous. Using it in Master/Slave is far more stable)

OBSERVATION #3

You mentioned There is some unsafe queries. Does it get some problems with MIXED based replication?

It depends. The most popular unsafe query is any UPDATE or DELETE that uses ORDER BY ... LIMIT. With SBR, this could possibly cause MySQL to UPDATE or DELETE rows from a table on the Slave in a different order tham that of the Master. With RBR, I believe the exact changes in a row are more identifiable to UPDATE or DELETE on the Slave.

SOLUTION : Avoid using unsafe queries. Then, you will not worry !!!

OBSERVATION #4

I just read your second link. ROFL !!! I am familiar with the poster of the answer.