Mysql – How to know if MySQL replication is working

MySQLmysql-5.5replication

I am replicating from a MySQL v 5.5.50 master, to a MySQL v 5.5.46 slave hosted in RDS. I've followed the AWS documentation on how to do this to the letter, got the dump file imported into the slave server and ran the CALL mysql.rds_set_external_master... command to begin things a few days ago.

My SHOW MASTER STATUS; result is:

+---------------------+----------+--------------+------------------+
| File                | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+---------------------+----------+--------------+------------------+
| mysql-binlog.000023 | 48684053 |              |                  |
+---------------------+----------+--------------+------------------+
1 row in set (0.00 sec)

And my SHOW SLAVE STATUS \G; result is:

*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: [master host IP here]
                  Master_User: [local replication user here]
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-binlog.000023
          Read_Master_Log_Pos: 53052514
               Relay_Log_File: relaylog.000007
                Relay_Log_Pos: 81145300
        Relay_Master_Log_File: mysql-binlog.000017
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table: mysql.rds_sysinfo,mysql.rds_history,mysql.rds_replication_status
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1016905328
              Relay_Log_Space: 5695987840
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 394467
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 21033229
1 row in set (0.12 sec)

And finally, the only two relevant lines (that aren't SLEEP from my client) visible in SHOW FULL PROCESSLIST; are:

*************************** 2. row ***************************
     Id: 207
   User: system user
   Host:
     db: NULL
Command: Connect
   Time: 361292
  State: Waiting for master to send event
   Info: NULL
*************************** 3. row ***************************
     Id: 208
   User: system user
   Host:
     db: NULL
Command: Connect
   Time: 394904
  State: Reading event from the relay log
   Info: NULL

I'm trying to confirm whether or not things are actually replicating, because the Seconds_Behind_Master value is getting bigger and bigger. However, the Master_Log_File value has gone up to match that of the master.

Exec_Master_Log_Pos and Relay_Log_Pos haven't changed in a long* time. However, Read_Master_Log_Pos is growing all the time.

I find no errors in the log files that RDS provides to me, no warnings or errors when the process started.

Is there a way I can tell if there has been some kind of error, and how can I look to recover from it?

(* Not 100% sure on how long that is, but measured in hours, not seconds)

UPDATE

I ran a slave stop and then slave start command, and in the log, I got the following:

160817 10:39:04 [Note] Slave SQL thread exiting, replication stopped in log 'mysql-binlog.000018' at position 72335184
160817 10:39:04 [ERROR] Error reading packet from server: Lost connection to MySQL server during query ( server_errno=2013)
160817 10:39:04 [Note] Slave I/O thread killed while reading event
160817 10:39:04 [Note] Slave I/O thread exiting, read up to log 'mysql-binlog.000025', position 74416238
160817 10:39:22 [Note] Slave SQL thread initialized, starting replication in log 'mysql-binlog.000018' at position 72335184, relay log '/rdsdbdata/log/relaylog/relaylog.000010' position: 72335333
160817 10:39:22 [Note] Slave I/O thread: connected to master '[replication user]@[server ip]:3306',replication started in log 'mysql-binlog.000025' at position 74416238

So nothing untoward there that I can see. The Seconds_Behind_Master is still going up and still no individual commands are displaying in my process list etc.

Best Answer

The two ...Running values say YES; that is usually sufficient to say all is well. However Seconds_Behind_Master: 394467 is scary. That 11 hours!

Run a test: On the Master do CREATE DATABASE dummy;, then see if it is there on the Slave. (You'll probably need to be root both times.)

There are situations where Seconds_Behind_Master bounces between 0 and some big value. That is a fluke. Watch it for a while to see that is the case.