MySQL how to change a slave to a new master

MySQLreplication

I have a MySQL active master (lets call it Alpha), a passive master (called Beta), and a slave (called Gamma).

Topology before change:

enter image description here

I am going to switch the active master and the passive master.

I will then need to also change the slave (Gamma) to the new active master (Beta). I am following the instructions in the book "Effective MySQL: Replication Techniques in Depth", page 97, which states before slave (Gamma) is pointed to the new active master (Beta) I should run:

beta> STOP SLAVE SQL_THREAD;
beta> SHOW SLAVE STATUS\G
beta> SHOW MASTER STATUS\G
beta> START SLAVE SQL_THREAD;

It says use the log position from that SHOW SLAVE STATUS on Beta, and use it to run on Gamma:

gamma> START SLAVE UNTIL MASTER_LOG_FILE='{file}', MASTER_LOG_POS={position}

My question is why is that step necessary. I am planning to stop all application writes to the current active master (Alpha), wait as long as necessary for Beta and Gamma replication to fully catch up.

Since they are both caught up with master (Alpha), is that command "START SLAVE UNTIL…" really needed?

Best Answer

You're correct. If the slaves are in sync to the same point on Alpha, START SLAVE UNTIL would serve no purpose.

Effectively, they're having you do is stop the slave on Gamma so that it is at the same point in Alpha's binlog as Beta. In a live, high traffic condition, that can be slightly trickier to do, so I assume that's their motivation for suggesting that approach.

If Gamma and Beta have identical Relay_Master_Log_File and Exec_Master_Log_Pos values, then they are at the same point relative to Alpha, and you can safely switch Gamma to Beta at Beta's current master log file and position.

However... to state the (presumably) obvious, you can't be doing any writing on Beta (other than via replication) around the time you make the cut, because if you're writing to Beta, then Gamma would miss any events that hadn't replicated from Beta to Gamma via Alpha... but the START SLAVE UNTIL wouldn't help you with that, either.

This means you need to actually verify that all three servers are in agreement in all possible directions, so there's nothing outstanding from Beta back to Alpha. Alpha's Exec_Master_Log_Pos (SHOW SLAVE STATUS) needs to match Beta's Position (SHOW MASTER STATUS) and both Gamma and Beta need to match Alpha and each other in the opposite direction (and of course the actual names of the binlogs need to match, not just the positions).

If you want to be super safe and paranoid, verify that everything is in agreement, then issue FLUSH TABLES WITH READ LOCK; on both Alpha and Beta (leaving the connections open to hold that global lock, which is yours once the prompt returns), then verify that everything remains in agreement at the same binlog positions with nothing written to either master server's binlog that the other hasn't already played back. Stop, re-home and restart replication on Gamma, verifying that it is connected and waiting for incoming events. Finally, verify agreement has persisted in each direction between Alpha and Beta once again, then UNLOCK TABLES; on Alpha and Beta.

Note that above, I never stopped the slave on Beta. It's not really necessary if you have a global read lock, since the slave is stalled and binlog coordinates can't advance under that condition, but you could stop it if desired, so long as the coordinates continue to match Alpha.

Related Solutions

Mysql – Getting slaves of a master-master setup stopped in sync

All of these approaches show that you gave these things a lot of thought.

You are worried about any pending changes when running FLUSH TABLES WITH READ LOCK;.

Think about this: When you issue FLUSH TABLES WITH READ LOCK;, how is replication affected? Recall that replication has two threads

IO Thread
SQL Thread

The IO Thread is responsible for communication between Master and Slave. It downloads binary log entries from the Master and stores them in the Slave's relay logs.

The SQL Thread is responsible for

reading the next SQL statement from the Slave's relay logs and processing them
maintain are temp tables created within the session of the SQL Thread

When you run FLUSH TABLES WITH READ LOCK;, only the SQL Thread gets affected because it needs to connect to tables. The IO Thread can still collect binary log entries from the Master and store them in the Slave's relay logs. Any replication lag will simply be caught off guard as is. In light of this, STOP SLAVE; should be faster than FLUSH TABLES WITH READ LOCK;. If you are concerned about pending changes, then use STOP SLAVE SQL_THREAD; instead of STOP SLAVE;. That way, whatever is last executed on each Master should be checked.

When you do SHOW SLAVE STATUS\G look for two lines

Relay_Master_Log_File (line 10)
Exec_Master_Log_Pos (line 22)

This tells you what was the SQL statement downloaded to the Slave that was last executed.

Knowing this, you could try the following

Step 01 : On M1 and M2, STOP SLAVE SQL_THREAD;
Step 02 : Run SHOW MASTER STATUS; on M1 and M2
Step 03 : Run SHOW SLAVE STATUS\G on M1 and M2
Step 04 : Evaluate this condition
- Does M1's File = M2's Relay_Master_Log_File ?
- Does M2's File = M1's Relay_Master_Log_File ?
- Does M1's Position = M2's Exec_Master_Log_Pos ?
- Does M2's Position = M1's Exec_Master_Log_Pos ?
Step 05 : If any one of the four conditions in Step 04 is not met
- On M1 and M2, START SLAVE SQL_THREAD;
- SELECT SLEEP(30);
- Go Back to Step 01

If you get past Step 05 with all four conditions in Step 04, M1 and M2 are in sync.

Once M1 and M2 are frozen simultaneously

S1 should match M1
- Wait until S1's Seconds_Behind_Master = 0
- M1's File = S1's Relay_Master_Log_File
- M1's Position = S1's Exec_Master_Log_Pos
S2 should match M2
- Wait until S2's Seconds_Behind_Master = 0
- M2's File = S2's Relay_Master_Log_File
- M2's Position = S2's Exec_Master_Log_Pos
No need to run STOP SLAVE; on S1 or S2

I hope this helps

UPDATE 2012-05-11 17:30 EDT

Once S1 and S2 match up with their respective Master, you could STOP SLAVE; if you want to. Since M1 and M2 are frozen, no other changes can reach S1 or S2. Thus, STOP SLAVE; is not a requirement but you do so anyway.

UPDATE 2012-05-11 21:29 EDT

Your Comment

M1/M2 are frozen from receiving updates from one another but not from receiving a legit update from an external client/application, no?

Are you still accepting incoming feeds? You did say in the original question

As I try thinking this out I keep running into gotchas that won't quite work out.

That would certainly be one gotcha. Therefore, discontinue incoming feeds.

Since you want to do FLUSH TABLES WITH READ LOCK; to M1 and M2, I have one recommendation. Please set this one hour before syncing everything:

SET GLOBAL innodb_max_dirty_pages_pct = 0;

This will clear all dirty pages from the InnoDB Buffer Pool. That way, the time for FLUSH TABLES WITH READ LOCK; is as fast as possible. When all syncing is done, set it back to 90 (if running MySQL 5.5) or 75 (otherwise).

Your Comment

I could see how M1/M2 were locked if they flushed w/ read lock but it seemed your steps were not including such a step

I was not including such a step because I was under the impression you would disable outside feeds.

Mysql – Create a thesql slave database from master database

Going by your question, I will like to review what I believe you did thus far:

You stopped mysql on the Master
You copied Master's /var/lib/mysql to the Slave's /var/lib/mysql
I surmise the binlogs on the Master were copied as well

Look at the Slave's last binlog. From the question, it should be

mydbm1-bin.008524
Filesize 1330529

Believe it or not, you have to do a few things:

1) On the Master, create a replication user like this:

GRANT REPLICATION SLAVE,REPLICATION CLIENT
ON *.* TO replicator@'%'
IDENTIFIED BY 'r3plic4t0R';

2) Make /var/lib/mysql on the Slave owned by mysql user

chown -R mysql:mysql /var/lib/mysql

3) Make sure Master's server_id is explicitly set in my.cnf

[mysqld]
server_id = 1

4) Make sure Slave's server_id is explicitly set in my.cnf

[mysqld]
server_id = 2

5) Startup mysql on the Slave

service mysql start

6) Setup replication by running this on the Slave

CHANGE MASTER TO
MASTER_HOST='IPAddressOfMaster',
MASTER_PORT=3306,
MASTER_USER='replicator',
MASTER_PASSWORD='r3plic4t0R',
MASTER_LOG_FILE='mydbm1-bin.008524',
MASTER_LOG_POS=1330529;
START SLAVE;
SELECT SLEEP(5);
SHOW SLAVE STATUS\G

You will see something like this:

mysql> show slave status\G
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: 10.48.20.253
                Master_User: replicant
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: mysql-bin.000254
        Read_Master_Log_Pos: 858190247
             Relay_Log_File: relay-bin.066069
              Relay_Log_Pos: 873918
      Relay_Master_Log_File: mysql-bin.000254
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
            Replicate_Do_DB:
        Replicate_Ignore_DB:
         Replicate_Do_Table:
     Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
                 Last_Errno: 0
                 Last_Error:
               Skip_Counter: 0
        Exec_Master_Log_Pos: 858190247
            Relay_Log_Space: 873772
            Until_Condition: None
             Until_Log_File:
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File:
         Master_SSL_CA_Path:
            Master_SSL_Cert:
          Master_SSL_Cipher:
             Master_SSL_Key:
      Seconds_Behind_Master: 0
1 row in set (0.00 sec)

If Slave_IO_Running and Slave_SQL_Running are both Yes, CONGRATULATIONS !!!

I already answered a post back on Feb 06, 2012 ( How to setup replication(Master/slave) in MySQL 5.5.20? ) with essentially the same steps.

I wanted to add additional posts I made for setting up Circular Replication should you decide to setup the two DB servers as Master/Master

Sep 24, 2012 : Setting up MySQL circular replication in existing replication topology?
May 07, 2012 : Setting Circular Replication in mysql

Best Answer

Related Solutions

Mysql – Getting slaves of a master-master setup stopped in sync

UPDATE 2012-05-11 17:30 EDT

UPDATE 2012-05-11 21:29 EDT

Your Comment

Your Comment

Mysql – Create a thesql slave database from master database

Related Question