Mysql – Syncing updates to master in master/slave setup [semi-sync replication]

MySQL, semi-sync-replication

Background info

I have two MySQL 5.5 servers set up in a Master/Slave configuration with Semi-synchronous replication.

The database is being used for a high-profile WordPress site.

Problem

It seems to work great except for one thing:

When the master is down and changes are made to the slave, they are not written back to the master when it comes back up.

What I've done so far

I tried the answer [here], but it broke my replication completely.

Questions

Is there a more standard way of setting this up or is it generally not supported?
Should I just disallow writes on the slave?
Is there a simple way to notify the master of changes, and sync them upstream?

Best Answer

You are almost there. I suspect you have missed the auto_increment_increment and auto_increment_offset variables.

Set auto_increment_increment to the total number of servers.

Set auto_increment_offset to a different number between 1 and auto_increment_increment on each server. That way the servers never generate the same AUTO_INCREMENT value when both accept writes.

Example:

Server 1: auto_increment_increment = 2, auto_increment_offset = 1

Server 2: auto_increment_increment = 2, auto_increment_offset = 2
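
To see why these two settings avoid key collisions, here is a minimal Python sketch (not MySQL code) that reproduces the documented series auto_increment_offset + N × auto_increment_increment for each server. The function name auto_increment_values is made up for illustration.

```python
def auto_increment_values(offset, increment, count):
    """Return the first `count` values of the series offset + N * increment."""
    return [offset + n * increment for n in range(count)]


# Settings from the example above: two servers, increment = 2.
server1 = auto_increment_values(offset=1, increment=2, count=5)
server2 = auto_increment_values(offset=2, increment=2, count=5)

print(server1)  # [1, 3, 5, 7, 9]
print(server2)  # [2, 4, 6, 8, 10]

# The two servers never hand out the same id.
assert not set(server1) & set(server2)
```

Server 1 only ever produces odd ids and Server 2 only even ids, so rows inserted on either side while the other is down do not clash when replication resumes.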

Related Solutions

Mysql – Getting slaves of a master-master setup stopped in sync

All of these approaches show that you gave these things a lot of thought.

You are worried about any pending changes when running FLUSH TABLES WITH READ LOCK;.

Think about this: When you issue FLUSH TABLES WITH READ LOCK;, how is replication affected? Recall that replication has two threads:

IO Thread
SQL Thread

The IO Thread is responsible for communication between Master and Slave. It downloads binary log entries from the Master and stores them in the Slave's relay logs.

The SQL Thread is responsible for

reading the next SQL statement from the Slave's relay logs and processing it
maintaining any temp tables created within the session of the SQL Thread

When you run FLUSH TABLES WITH READ LOCK;, only the SQL Thread is affected, because it needs to access tables. The IO Thread can still collect binary log entries from the Master and store them in the Slave's relay logs. Any replication lag simply stays as it is while the lock is held. In light of this, STOP SLAVE; should be faster than FLUSH TABLES WITH READ LOCK;. If you are concerned about pending changes, use STOP SLAVE SQL_THREAD; instead of STOP SLAVE;. That way, the IO Thread keeps downloading changes and you can check what was last executed from each Master.

When you run SHOW SLAVE STATUS\G, look for two lines:

Relay_Master_Log_File (line 10)
Exec_Master_Log_Pos (line 22)

Together, these give the position in the Master's binary log of the last downloaded statement that the Slave executed.

Knowing this, you could try the following

Step 01 : On M1 and M2, STOP SLAVE SQL_THREAD;
Step 02 : Run SHOW MASTER STATUS; on M1 and M2
Step 03 : Run SHOW SLAVE STATUS\G on M1 and M2
Step 04 : Evaluate this condition
- Does M1's File = M2's Relay_Master_Log_File ?
- Does M2's File = M1's Relay_Master_Log_File ?
- Does M1's Position = M2's Exec_Master_Log_Pos ?
- Does M2's Position = M1's Exec_Master_Log_Pos ?
Step 05 : If any one of the four conditions in Step 04 is not met
- On M1 and M2, START SLAVE SQL_THREAD;
- SELECT SLEEP(30);
- Go Back to Step 01

If you get past Step 05 with all four conditions in Step 04 met, M1 and M2 are in sync.
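
The comparison in Step 04 could be sketched as follows. This is only an illustration in Python, assuming the output of SHOW MASTER STATUS and SHOW SLAVE STATUS has already been loaded into dictionaries keyed by column name. The function names and the sample binary log names and positions are made up.

```python
def executed_up_to(master_status, peer_slave_status):
    """True if the peer has executed everything up to this master's coordinates."""
    return (
        master_status["File"] == peer_slave_status["Relay_Master_Log_File"]
        and int(master_status["Position"]) == int(peer_slave_status["Exec_Master_Log_Pos"])
    )


def masters_in_sync(m1_master, m1_slave, m2_master, m2_slave):
    """Check all four conditions of Step 04 for a master-master pair."""
    # M1's File/Position against what M2 has executed, and vice versa.
    return executed_up_to(m1_master, m2_slave) and executed_up_to(m2_master, m1_slave)


# Hypothetical sample values, not taken from a real server.
m1_master = {"File": "mysql-bin.000012", "Position": 4711}
m2_master = {"File": "mysql-bin.000034", "Position": 107}
m1_slave = {"Relay_Master_Log_File": "mysql-bin.000034", "Exec_Master_Log_Pos": 107}
m2_slave = {"Relay_Master_Log_File": "mysql-bin.000012", "Exec_Master_Log_Pos": 4500}

print(masters_in_sync(m1_master, m1_slave, m2_master, m2_slave))  # False: M2 is behind M1
```

In this sample, M2 has only executed up to position 4500 of M1's log, so Step 05 applies: restart the SQL threads, wait, and check again.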

Once M1 and M2 are frozen simultaneously:

S1 should match M1
- Wait until S1's Seconds_Behind_Master = 0
- M1's File = S1's Relay_Master_Log_File
- M1's Position = S1's Exec_Master_Log_Pos
S2 should match M2
- Wait until S2's Seconds_Behind_Master = 0
- M2's File = S2's Relay_Master_Log_File
- M2's Position = S2's Exec_Master_Log_Pos
No need to run STOP SLAVE; on S1 or S2
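
The wait on S1 and S2 could be sketched as a small polling loop. This is a hedged Python illustration, not a tested tool: fetch_slave_status is a hypothetical callable that returns the current SHOW SLAVE STATUS row as a dictionary, and the sample values are invented.

```python
import time


def slave_caught_up(master_status, slave_status):
    """True if the slave reports no lag and sits at the master's coordinates."""
    lag = slave_status["Seconds_Behind_Master"]  # None when the SQL thread is stopped
    return (
        lag is not None
        and int(lag) == 0
        and slave_status["Relay_Master_Log_File"] == master_status["File"]
        and int(slave_status["Exec_Master_Log_Pos"]) == int(master_status["Position"])
    )


def wait_for_slave(master_status, fetch_slave_status, attempts=10, delay=1.0):
    """Poll the slave until it matches the frozen master, or give up."""
    for _ in range(attempts):
        if slave_caught_up(master_status, fetch_slave_status()):
            return True
        time.sleep(delay)
    return False


# Hypothetical readings: S1 is behind on the first poll and caught up on the second.
m1 = {"File": "mysql-bin.000007", "Position": 1200}
samples = iter([
    {"Seconds_Behind_Master": 4, "Relay_Master_Log_File": "mysql-bin.000007", "Exec_Master_Log_Pos": 900},
    {"Seconds_Behind_Master": 0, "Relay_Master_Log_File": "mysql-bin.000007", "Exec_Master_Log_Pos": 1200},
])
print(wait_for_slave(m1, lambda: next(samples), attempts=5, delay=0))  # True
```

Because the masters are frozen, their File and Position do not move, so the master's coordinates only need to be read once before polling each slave.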

I hope this helps

UPDATE 2012-05-11 17:30 EDT

Once S1 and S2 match up with their respective Master, you could run STOP SLAVE; if you want to. Since M1 and M2 are frozen, no other changes can reach S1 or S2. Thus, STOP SLAVE; is not a requirement, but you can do so anyway.

UPDATE 2012-05-11 21:29 EDT

Your Comment

M1/M2 are frozen from receiving updates from one another but not from receiving a legit update from an external client/application, no?

Are you still accepting incoming feeds? You did say in the original question

As I try thinking this out I keep running into gotchas that won't quite work out.

That would certainly be one gotcha. Therefore, discontinue incoming feeds.

Since you want to do FLUSH TABLES WITH READ LOCK; to M1 and M2, I have one recommendation. Please set this one hour before syncing everything:

SET GLOBAL innodb_max_dirty_pages_pct = 0;

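This makes InnoDB flush dirty pages from the InnoDB Buffer Pool to disk as aggressively as it can. That way, FLUSH TABLES WITH READ LOCK; takes as little time as possible. When all syncing is done, set it back to the value it had before (note that value first; the default is 75 in MySQL 5.5 and 90 in the older built-in InnoDB of MySQL 5.1).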

Your Comment

I could see how M1/M2 were locked if they flushed w/ read lock but it seemed your steps were not including such a step

I was not including such a step because I was under the impression you would disable outside feeds.

Mysql – Breaking Semisynchronous Replication in MySQL 5.5

The one thing you can probably do is to increase the sensitivity of the acknowledgement timeout.

Please look for this variable in /etc/my.cnf

[mysqld]
rpl_semi_sync_master_timeout=5000

When rpl_semi_sync_master_timeout (default is 10000 milliseconds, or 10 seconds) is set to 5000, the Master should switch from semisynchronous to asynchronous replication if no acknowledgement is received from the Slave within 5 seconds (5000 milliseconds). You may want to lower this even further.

You may need to check your network performance. As long as you are not replicating across geographic distance, you should try the following: on a separate NIC, use a crossover cable on a 192.168.xx.xx network so as to have faster replication response. Also, check the switch for dropped packets.

You should not have to restart the server every time. As a quick-and-dirty band-aid, create this SQL script:

STOP SLAVE IO_THREAD;
SELECT SLEEP(5);
START SLAVE IO_THREAD;

Simply write a cronjob to execute these commands every 5 minutes. Again, this is a band-aid. Otherwise, due diligence on the networking side is in order.
