MySQL Master not sending events to another Master

MySQLmysql-5.6replication

We have Master-Master Replication and it is configured correctly as it transmit data for certain period of time and after that it does not send further events till we stop/start slave on both nodes.

When I run Show Slave Status\G, it displays

Seconds_Behind_Master as 0
Slave_SQL_Running as Yes
Slave_IO_Running as Yes
server_id is also different for both nodes.

My Server Info

MySQL Version is 5.6.23
Windows Server 2012

Any help would be great.

Best Answer

This could be one of two things

Slave I/O Thread is encountering Network Issues
Slave I/O is Playing Possum (pretending to be dead)

Derek Downey and I have written posts about the flaky behavior of the Slave's I/O Thread

Oct 17, 2011 : Master port is disabled from Public IP, MySQL log is not getting refreshed
Sep 30, 2011 : Can we capture only Slave_IO_Running in SHOW SLAVE STATUS in MySQL)

According to the Bug Reports on this phenomenon, it was supposedly fixed.

See (Bug #30703 and Bug #51089)

If this bug is happening to you, someone missed a patch making MySQL 5.6.23.

I have three suggestions

Drop back to MySQL 5.6.22 or 5.6.21 or last version of MySQL 5.6 where replication was working
Use MySQL 5.6's heartbeat option when running CHANGE MASTER TO.
Implement semisynchronous replication

As for the Network

Check Network For Dropped Packets
Check the NIC Card
Use separate NIC card for Replication Traffic

Other that what I have just said, sorry I don't have a silver bullet for this one :-(

SCENARIO #1

You may want to monitor Relay_Log_Space. It represents the some total of relay logs the Slave has to process. When should you worry about it?

Give this scenario

Slave_IO_Running is Yes
Slave_SQL_Running is No

This indicates that that the IO Thread is still downloading entries from its Master's binary logs. The longer you take to address the SQL error, the more the Relay_Log_Space will pile up. If the Master is in a heavy-write environment, relay logs will pile up quickly and puts the Slave at risk for disk space issues.

If you do not want to monitor Relay_Log_Space, you can always set the relay_log_space_limit. This force a cap on the amount of disk space allow for relay logs. For example, if you set this:

[mysqld]
relay_log_space_limit=8G

and restart mysql, this causes the IO Thread to stop collecting new entries from the Master when the sum of all relay log file sizes >= 8GB.. After the SQL Thread is done processing the oldest relay log, that relay log gets rotated out and the IO Thread starts getting entries from the Master where it left off. This provides a decent throttling mechanism for relay logs.

Another field you may want to look at is Last_SQL_Errno. Once Slave_IO_Running is Yes and Slave_SQL_Running is No, the Last_SQL_Errno will be set to something nonzero. On very rare occasions, it is possible to have this:

Slave_IO_Running is Yes
Slave_SQL_Running is No
Last_SQL_Errno is 0

When you have this unique phenomenon, mysqld has encountered an undocumented error (Don't laugh, I have seen this).

SCENARIO #2

Something else to keep in mind (not necessarily for monitoring)

Give this scenario

Slave_IO_Running is No
Slave_SQL_Running is No

Under what three circumstances would this occur?

Someone ran STOP SLAVE; by hand or in a backup script
The option skip-slave-start is in /etc/my.cnf and mysql was started
Issuing service mysql start --skip-slave-start or service mysql restart --skip-slave-start

Make sure not to monitor false positives should both replication threads are not running.

SCENARIO #3

Give this scenario

Slave_IO_Running is No
Slave_SQL_Running is Yes

This quickly indicates that there is a communication error because of one of the following

The Master went down
mysqld on the Master went down
Network Intermittency

MySQL slave replicates changes that are in neither binlog_do_db nor replicate_do_db

Replication filtering isn't bulletproof. Due to how the filtering is implemented the events responsible for your errors are being generated because the default database at query runtime is the my-database schema as expected and the query being executed is fully qualified INSERT INTO phpmyadmin.pma_column_info...

Peter Zaitsev explains the scenario well in this post:

Filtered MySQL Replication

Best Answer

Related Solutions

Mysql – Programatically testing slave status – What fields in SHOW SLAVE STATUS indicated slave status is OK / in ERROR

SCENARIO #1

SCENARIO #2

SCENARIO #3

MySQL slave replicates changes that are in neither binlog_do_db nor replicate_do_db

Related Question