There is a more stable approach you can try.
Here is something to remember: whenever you run CHANGE MASTER TO, it erases every relay log you have. That is acceptable, because you do not want to keep relay logs holding commands whose SQL you have not yet executed; those entries can simply be re-fetched from the Master.
The following is an excerpt taken from a post I made back on Feb 03, 2012, How to resolve the master server shutdown/unavailability in MySQL with master-slave replication:
Please notice that there are two sets of replication coordinates from the Master:
- (Master_Log_File, Read_Master_Log_Pos)
- (Relay_Master_Log_File, Exec_Master_Log_Pos)
There is a major difference between them.
(Master_Log_File, Read_Master_Log_Pos) tells you the file and position of the last binlog statement the Slave read from the Master and placed in its Relay Logs.
(Relay_Master_Log_File, Exec_Master_Log_Pos) tells you the file and position, in the Master's binary logs, of the statement that is NEXT TO BE EXECUTED ON THE SLAVE; everything before that point has already been executed.
What you want are two things:
- Erase Every Relay Log You Have
- Start Collecting Binary Log Entries From the Last SQL Statement You Successfully Executed
In your case, you must use the second set of Replication Coordinates:
- Relay_Master_Log_File
- Exec_Master_Log_Pos
It is easy to distrust a corrupt relay log when the error message calls one out. The case that hurts the most is a corrupt Master log; you will have to jump through hoops if that is what happened. If, on the other hand, one of the other situations caused the corrupt relay log, the simplest and most concise approach is the one I just described.
To make sure, take whatever is reported for Relay_Master_Log_File and, if that particular binary log still exists on the Master, run mysqlbinlog against it. If it dumps in its entirety without corrupt characters, go ahead and use the second set of replication coordinates.
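For example, a quick sanity check from a shell on the Master (the /var/lib/mysql path assumes the default datadir, so adjust it for your installation):

mysqlbinlog /var/lib/mysql/mysql-bin.000254 > /dev/null
echo $?

An exit status of 0 with nothing printed on stderr means the log decoded cleanly from beginning to end.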
From that same earlier post:
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.48.20.253
Master_User: replicant
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000254
Read_Master_Log_Pos: 858190247
Relay_Log_File: relay-bin.066069
Relay_Log_Pos: 873918
Relay_Master_Log_File: mysql-bin.000254
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 858190247
Relay_Log_Space: 873772
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
1 row in set (0.00 sec)
Notice that the Replication Coordinates from SHOW SLAVE STATUS\G for what was last executed are (mysql-bin.000254, 858190247). The CHANGE MASTER TO command in this case would be:
CHANGE MASTER TO master_log_file='mysql-bin.000254',master_log_pos=858190247;
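In practice, you would run the whole sequence on the Slave (a minimal sketch of the surrounding steps):

mysql> STOP SLAVE;
mysql> CHANGE MASTER TO master_log_file='mysql-bin.000254', master_log_pos=858190247;
mysql> START SLAVE;
mysql> SHOW SLAVE STATUS\G

As noted above, the CHANGE MASTER TO wipes the relay logs, and the Slave then re-requests everything from that position onward.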
Give it a Try !!!
UPDATE 2012-09-14 16:38 EDT
If you are worried about relay logs stockpiling, just throttle them. In SHOW SLAVE STATUS\G there is a field called Relay_Log_Space, which gives you the sum of all relay log sizes in bytes. Did you know you can put a cap on that number?
The option is called relay_log_space_limit.
For example, if you want to cap the total number of bytes at 10G, do the following:
STEP 01) Add this to /etc/my.cnf on the Slave
[mysqld]
relay_log_space_limit = 10G
STEP 02) Run service mysql restart on the Slave
and that's it !!!
Once the oldest relay log has had all its entries processed, it is deleted and a new relay log is created; that one fills until all relay logs together add up to 10G. This is the only way to control runaway relay log space.
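After the restart, you can confirm the cap took effect:

mysql> SHOW GLOBAL VARIABLES LIKE 'relay_log_space_limit';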
UPDATE 2012-09-14 18:10 EDT
SUGGESTION : If you make mysqldump backups of the data on the Slave every midnight, you could set up the following to keep from accumulating 1TB of binary logs:
STEP 01) Add this to /etc/my.cnf on the Master
[mysqld]
expire_logs_days = 14
STEP 02) Run this query on the Master
mysql> PURGE BINARY LOGS BEFORE DATE(NOW()) - INTERVAL 14 DAY;
STEP 03) Run service mysql restart on the Master
STEP 04) Add a mysqldump backup script to a crontab on the Slave
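For example, a hypothetical crontab entry (the credentials file and backup path are assumptions, so adapt them to your setup; the \% is needed because cron treats a bare % as a newline):

0 0 * * * /usr/bin/mysqldump --defaults-extra-file=/root/.my.cnf --all-databases --single-transaction --routines --triggers | gzip > /backups/slave-$(date +\%F).sql.gz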
This will make the Slave more useful and keep excess binary logs from piling up.
Yes, it does use MySQL binary logs for Multi-AZ replication.
This FAQ on the Amazon site confirms it:
Q: Can I directly access the binary logs for my Database Instance to manage my own replication?
A. Amazon RDS does not currently provide access to the binary logs for your Database Instance.
http://aws.amazon.com/rds/faqs/#103
And this one does too:
You may find in some cases that your Read Replica(s) aren’t able to
receive or apply updates from their source Multi-AZ DB Instance after
a Multi-AZ failover. This is because some MySQL binlog events were not
flushed to disk at the time of the failover. After the failover, the
Read Replica may ask for binlogs from the source that it doesn’t have.
This loss of MySQL binlogs during a crash is described in the MySQL
document here.
http://aws.amazon.com/rds/faqs/#107
This simply confirms that it does use binary logs.
Best Answer
This doesn't appear to be anything intrinsic to MySQL Server, so I can't give you a proper citation on what "causes" this to happen -- but I strongly suspect it is an element of the design of RDS.
If you take a look at
SHOW FULL PROCESSLIST;
you'll notice there's always a user connected called "rdsadmin." If you look in the mysql schema, you'll spot a table called rds_heartbeat2, with a recent epoch time x 1000 stored in its single row. This is changed every few minutes, and my reasoned speculation is that "rdsadmin" is doing it. I selected a binlog at random to check this theory, and the first event in it was exactly such a write to rds_heartbeat2.
I'll speculate, then, that the "rdsadmin" user is flushing the binlog, then immediately writing a new value to this table. Reading the value from this table on the slaves would give the RDS supervisory systems a mechanism for determining/monitoring the slave's behavior, allowing it to purge binlogs that it knows have been fully processed or let them linger on the master if they haven't, so that a slave would not be caught without the necessary logs being available.
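You can watch this yourself by polling the table on an RDS instance (a quick check; divide the stored value by 1000 and feed it to FROM_UNIXTIME() to get a readable timestamp):

mysql> SELECT * FROM mysql.rds_heartbeat2;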
Instead of letting MySQL age out its binlogs on its own based on the global variable expire_logs_days, RDS seems to be managing this process externally, and quite possibly that process is responsible for archiving the logs out of sight so that they can be used for the native RDS point-in-time restoration feature.
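If you're curious what the server itself reports for that variable, it's easy to check (a quick look; if RDS manages expiration externally, the value alone won't tell the whole story):

mysql> SHOW GLOBAL VARIABLES LIKE 'expire_logs_days';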