There is a more stable approach you can try.
Here is something to remember: whenever you run CHANGE MASTER TO, it erases every relay log you have. That is acceptable, because you do not want to keep relay logs holding commands whose SQL you have not yet executed; those entries can simply be re-fetched from the Master.
The following is an excerpt taken from a post I made back on Feb 03, 2012, How to resolve the master server shutdown/unavailability in MySQL with master-slave replication:
Please notice that there are two sets of replication coordinates from the Master:
- (Master_Log_File, Read_Master_Log_Pos)
- (Relay_Master_Log_File, Exec_Master_Log_Pos)
There is a major difference between them.
(Master_Log_File, Read_Master_Log_Pos) tells you the file and position of the last binlog statement the Slave read from the Master and placed in its Relay Logs.
(Relay_Master_Log_File, Exec_Master_Log_Pos) tells you the file and position, in the Master's binary logs, of the statement that is NEXT TO BE EXECUTED ON THE SLAVE; everything before that point has already been executed.
What you want are two things:
- Erase Every Relay Log You Have
- Start Collecting Binary Log Entries From the Last SQL Statement You Successfully Executed
In your case, you must use the second set of Replication Coordinates:
- Relay_Master_Log_File
- Exec_Master_Log_Pos
It is easy to distrust a corrupt relay log when the error message calls one out. The case that hurts the most is a corrupt Master log; you will have to jump through hoops if that is what happened. If, on the other hand, one of the other situations caused the corrupt relay log, the simplest and most concise approach is the one I just described.
To make sure, take whatever is reported for Relay_Master_Log_File and, if that particular binary log still exists on the Master, run mysqlbinlog against it. If it dumps in its entirety without corrupt characters, go ahead and use the second set of replication coordinates.
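For example, a quick sanity check from a shell on the Master (the /var/lib/mysql path assumes the default datadir, so adjust it for your installation):

mysqlbinlog /var/lib/mysql/mysql-bin.000254 > /dev/null
echo $?

An exit status of 0 with nothing printed on stderr means the log decoded cleanly from beginning to end.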
From that same earlier post:
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.48.20.253
Master_User: replicant
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000254
Read_Master_Log_Pos: 858190247
Relay_Log_File: relay-bin.066069
Relay_Log_Pos: 873918
Relay_Master_Log_File: mysql-bin.000254
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 858190247
Relay_Log_Space: 873772
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
1 row in set (0.00 sec)
Notice that the Replication Coordinates from SHOW SLAVE STATUS\G for what was last executed are (mysql-bin.000254, 858190247). The CHANGE MASTER TO command in this case would be:
CHANGE MASTER TO master_log_file='mysql-bin.000254',master_log_pos=858190247;
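In practice, you would run the whole sequence on the Slave (a minimal sketch of the surrounding steps):

mysql> STOP SLAVE;
mysql> CHANGE MASTER TO master_log_file='mysql-bin.000254', master_log_pos=858190247;
mysql> START SLAVE;
mysql> SHOW SLAVE STATUS\G

As noted above, the CHANGE MASTER TO wipes the relay logs, and the Slave then re-requests everything from that position onward.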
Give it a Try !!!
UPDATE 2012-09-14 16:38 EDT
If you are worried about relay logs stockpiling, just throttle them. In SHOW SLAVE STATUS\G there is a field called Relay_Log_Space, which gives you the sum of all relay log sizes in bytes. Did you know you can put a cap on that number?
The option is called relay_log_space_limit.
For example, if you want to cap the total number of bytes at 10G, do the following:
STEP 01) Add this to /etc/my.cnf on the Slave
[mysqld]
relay_log_space_limit = 10G
STEP 02) Run service mysql restart on the Slave
and that's it !!!
Once the oldest relay log has had all its entries processed, it is deleted and a new relay log is created; that one fills until all relay logs together add up to 10G. This is the only way to control runaway relay log space.
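After the restart, you can confirm the cap took effect:

mysql> SHOW GLOBAL VARIABLES LIKE 'relay_log_space_limit';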
UPDATE 2012-09-14 18:10 EDT
SUGGESTION : If you make mysqldump backups of the data on the Slave every midnight, you could set up the following to keep from accumulating 1TB of binary logs:
STEP 01) Add this to /etc/my.cnf on the Master
[mysqld]
expire_logs_days = 14
STEP 02) Run this query on the Master
mysql> PURGE BINARY LOGS BEFORE DATE(NOW()) - INTERVAL 14 DAY;
STEP 03) Run service mysql restart on the Master
STEP 04) Add a mysqldump backup script to a crontab on the Slave
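For example, a hypothetical crontab entry (the credentials file and backup path are assumptions, so adapt them to your setup; the \% is needed because cron treats a bare % as a newline):

0 0 * * * /usr/bin/mysqldump --defaults-extra-file=/root/.my.cnf --all-databases --single-transaction --routines --triggers | gzip > /backups/slave-$(date +\%F).sql.gz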
This will make the Slave more useful and keep excess binary logs from piling up.
Yes, it does use MySQL binary logs for Multi-AZ replication.
This FAQ on the Amazon site confirms it:
Q: Can I directly access the binary logs for my Database Instance to manage my own replication?
A. Amazon RDS does not currently provide access to the binary logs for your Database Instance.
http://aws.amazon.com/rds/faqs/#103
And this one does too:
You may find in some cases that your Read Replica(s) aren’t able to
receive or apply updates from their source Multi-AZ DB Instance after
a Multi-AZ failover. This is because some MySQL binlog events were not
flushed to disk at the time of the failover. After the failover, the
Read Replica may ask for binlogs from the source that it doesn’t have.
This loss of MySQL binlogs during a crash is described in the MySQL
document here.
http://aws.amazon.com/rds/faqs/#107
This simply confirms that it does use binary logs.
Best Answer
This doesn't appear to be anything intrinsic to MySQL Server, so I can't give you a proper citation on what "causes" this to happen -- but I strongly suspect it is an element of the design of RDS.
If you take a look at
SHOW FULL PROCESSLIST;
you'll notice there's always a user connected called "rdsadmin." If you look in the mysql schema, you'll spot a table called rds_heartbeat2, with a recent epoch time x 1000 stored in its single row. This is changed every few minutes, and my reasoned speculation is that "rdsadmin" is doing it. I selected a binlog at random to check this theory, and the first event in it was exactly such a write to rds_heartbeat2.
I'll speculate, then, that the "rdsadmin" user is flushing the binlog, then immediately writing a new value to this table. Reading the value from this table on the slaves would give the RDS supervisory systems a mechanism for determining/monitoring the slave's behavior, allowing it to purge binlogs that it knows have been fully processed or let them linger on the master if they haven't, so that a slave would not be caught without the necessary logs being available.
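You can watch this yourself by polling the table on an RDS instance (a quick check; divide the stored value by 1000 and feed it to FROM_UNIXTIME() to get a readable timestamp):

mysql> SELECT * FROM mysql.rds_heartbeat2;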
Instead of letting MySQL age out its binlogs on its own based on the global variable expire_logs_days, RDS seems to be managing this process externally, and quite possibly that process is responsible for archiving the logs out of sight so that they can be used for the native RDS point-in-time restoration feature.
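If you're curious what the server itself reports for that variable, it's easy to check (a quick look; if RDS manages expiration externally, the value alone won't tell the whole story):

mysql> SHOW GLOBAL VARIABLES LIKE 'expire_logs_days';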