MySQL – Error 1236 on Slave After Master-Master Failure

Tags: multi-master, MySQL, mysql-5.6, replication

I have a master (master 1) that replicates to another master (master 2), which then replicates to its slave.

So master 2 had an issue with the binary logs:

Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.

I've run into this before and was able to resolve it by running the same commands I found in this thread: expire_logs_days directive requires change master? (I'll ask a separate question later about why this keeps occurring; it is not my current issue.)

If you're interested, here's the relevant bit from that thread showing how I got it working:

stop slave;
reset slave;
change master to master_log_file='...', master_log_pos=...;
start slave;

So now master 2 is good and replicating from master 1. This time, though, the slave of master 2 broke as well.

The error I'm getting is:

Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from position > file size; the first event 'mysqld-bin.000397' at 244145356, the last event read from './mysqld-bin.000397' at 4, the last byte read from './mysqld-bin.000397' at 4.'

When I run SHOW BINARY LOGS; on the slave I get:

ERROR 1381 (HY000): You are not using binary logging

What happened to my slave? Why's it not working with the logs anymore? The master still has the log file so I figured running the stop, reset, change, start would resolve the issue (because it would re-request the logs) but it didn't.

Best Answer

I'll bet sync_binlog was turned off (set to 0).

With it off, the binlog entries just before the crash may not have been flushed to the binlog file, even though they have been sent to the Slave.

Sending replication data from Master to Slave:

1. Write to the table on the Master.
2. Buffer up the write to the binlog.
3. Optionally flush the buffer to the binlog (controlled by sync_binlog).
4. Send the query to the Slave(s).

When sync_binlog=0, there is a big chance that, after a crash, the binlog on disk will be shorter than what the slave thinks it should be.
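As a rough illustration of that mismatch, the sketch below pulls the requested coordinates out of the 1236 error text quoted in the question and compares the position with a master-side file size. The helper names and the file size of 200000000 bytes are hypothetical; the real size would have to be read on the master (for example from SHOW BINARY LOGS there).

```python
import re
from typing import Tuple

# Error text as quoted in the question.
ERROR_1236 = (
    "Got fatal error 1236 from master when reading data from binary log: "
    "'Client requested master to start replication from position > file size; "
    "the first event 'mysqld-bin.000397' at 244145356, the last event read from "
    "'./mysqld-bin.000397' at 4, the last byte read from './mysqld-bin.000397' at 4.'"
)


def requested_coordinates(message: str) -> Tuple[str, int]:
    """Return the (binlog file, position) the slave asked the master for."""
    match = re.search(r"the first event '([^']+)' at (\d+)", message)
    if match is None:
        raise ValueError("message does not contain requested coordinates")
    return match.group(1), int(match.group(2))


def is_past_end(requested_pos: int, master_file_size: int) -> bool:
    """True when the slave asks for a position beyond the end of the master's file."""
    return requested_pos > master_file_size


log_file, log_pos = requested_coordinates(ERROR_1236)
# 200000000 is a made-up size standing in for the truncated binlog on the master.
print(log_file, log_pos, is_past_end(log_pos, 200000000))
```

If the check comes back true, the slave is holding coordinates for events that never reached the master's binlog file.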

When the Slave-Master connection is reestablished, the Slave picks up where it left off. With sync_binlog=1, that would be at the exact end of some binlog, and the Slave would decide to move to the next binlog. The manual CHANGE MASTER simulates that.

So issue the CHANGE MASTER pointing at position 0 (or 4) of the next binlog (bump the file number by 1).

(I have never used RESET SLAVE; I see no reason for it.)
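A minimal sketch of the "bump the number by 1" step, using the file name from the error in the question. The helper names are hypothetical, and it assumes the master really did rotate to the next numbered file after the crash; check SHOW BINARY LOGS on the master before running the generated statement. Position 4 is used because the first event in a binlog starts right after the 4-byte file header.

```python
import re


def next_binlog(name: str) -> str:
    """Return the binlog file name that follows `name` in the sequence."""
    match = re.fullmatch(r"(.+)\.(\d+)", name)
    if match is None:
        raise ValueError(f"not a binlog file name: {name!r}")
    base, number = match.groups()
    # Preserve the zero-padding width of the original numeric suffix.
    return f"{base}.{int(number) + 1:0{len(number)}d}"


def change_master_sql(log_file: str, log_pos: int = 4) -> str:
    """Build the CHANGE MASTER TO statement for the given coordinates."""
    return f"CHANGE MASTER TO master_log_file='{log_file}', master_log_pos={log_pos};"


print(change_master_sql(next_binlog("mysqld-bin.000397")))
```

The printed statement would be run on the broken slave between STOP SLAVE and START SLAVE.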

Related Solutions

Mysql – Slave start to generate relay log files but not master server

You need a log-bin entry in the master's my.cnf file.

Per http://dev.mysql.com/doc/refman/5.0/en/binary-log.html

To enable the binary log, start the server with the --log-bin[=base_name] option. If no base_name value is given, the default name is the value of the pid-file option (which by default is the name of host machine) followed by -bin. If the basename is given, the server writes the file in the data directory unless the basename is given with a leading absolute path name to specify a different directory.

After setting the log-bin and restarting the master, you then can run RESET MASTER; and then SHOW MASTER STATUS; to get the correct values for your CHANGE MASTER command for your slave.

Mysql – Looking for an efficient way to fix “Could not parse relay log event entry…” error

There is a more stable approach you can try.

Here is something to remember

Whenever you run CHANGE MASTER TO, it will erase every relay log you have. That is what you want here: there is no point keeping relay logs whose events the slave has not executed yet.

The following is an excerpt taken from a post I made back on Feb 03, 2012: How to resolve the master server shut down/unavailability in mysql with master - slave replication:

Please notice that there are two sets of replication coordinates from the Master

(Master_Log_File,Read_Master_Log_Pos)

(Relay_Master_Log_File,Exec_Master_Log_Pos)

There is a major difference between them

(Master_Log_File,Read_Master_Log_Pos) tells you the last binlog statement from the Master's log file and log position that the Slave read from the Master and placed in its Relay Logs.

(Relay_Master_Log_File,Exec_Master_Log_Pos) tells you the Master's log file and log position up to which the Slave has executed what it placed in its Relay Logs. The statement at that position is the one THAT IS NEXT TO BE EXECUTED ON THE SLAVE.

What you want are two things:

1. Erase every relay log you have.
2. Start collecting binary log entries from the last SQL you successfully executed.

In your case, you must use the second set of Replication Coordinates

Relay_Master_Log_File
Exec_Master_Log_Pos

It is easy to distrust a corrupt relay log as shown in the error message. The one that hurts the most is a corrupt Master Log. You will have to jump through hoops if that is the case. On the other hand, if one of the other situations was the reason for the corrupt relay log, the simplest and most concise approach is what I stated.

To make sure, check whether the binary log reported as Relay_Master_Log_File still exists on the Master and, if it does, run mysqlbinlog on it. If it dumps in its entirety without corrupt characters, go ahead and use the second set of replication coordinates.

From my same earlier post:

mysql> show slave status\G
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: 10.48.20.253
                Master_User: replicant
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: mysql-bin.000254
        Read_Master_Log_Pos: 858190247
             Relay_Log_File: relay-bin.066069
              Relay_Log_Pos: 873918
      Relay_Master_Log_File: mysql-bin.000254
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
            Replicate_Do_DB:
        Replicate_Ignore_DB:
         Replicate_Do_Table:
     Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
                 Last_Errno: 0
                 Last_Error:
               Skip_Counter: 0
        Exec_Master_Log_Pos: 858190247
            Relay_Log_Space: 873772
            Until_Condition: None
             Until_Log_File:
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File:
         Master_SSL_CA_Path:
            Master_SSL_Cert:
          Master_SSL_Cipher:
             Master_SSL_Key:
      Seconds_Behind_Master: 0
1 row in set (0.00 sec)

Notice that the Replication Coordinates from SHOW SLAVE STATUS\G for what was last executed are (mysql-bin.000254,858190247). The CHANGE MASTER TO command in this case would be:

CHANGE MASTER TO master_log_file='mysql-bin.000254',master_log_pos=858190247;
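The same lookup can be scripted. The sketch below is only an illustration: it parses the text of SHOW SLAVE STATUS\G (a shortened copy of the output above) and builds the statement from the second set of coordinates. The function names are hypothetical, and the text is assumed to have been captured from the mysql client beforehand.

```python
SAMPLE_STATUS = """
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
            Master_Log_File: mysql-bin.000254
        Read_Master_Log_Pos: 858190247
      Relay_Master_Log_File: mysql-bin.000254
        Exec_Master_Log_Pos: 858190247
1 row in set (0.00 sec)
"""


def parse_slave_status(text: str) -> dict:
    """Turn 'Field: value' lines into a dict; lines without a colon are skipped."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            status[key.strip()] = value.strip()
    return status


def resume_statement(status: dict) -> str:
    """Build CHANGE MASTER TO from the last executed coordinates."""
    log_file = status["Relay_Master_Log_File"]
    log_pos = int(status["Exec_Master_Log_Pos"])
    return f"CHANGE MASTER TO master_log_file='{log_file}',master_log_pos={log_pos};"


print(resume_statement(parse_slave_status(SAMPLE_STATUS)))
```

Reading Relay_Master_Log_File and Exec_Master_Log_Pos, rather than the Read_ pair, is the point: the slave then re-fetches everything it had downloaded but not yet executed.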

Give it a Try !!!

UPDATE 2012-09-14 16:38 EDT

If you are worried about stockpiling relay logs, just throttle the relay logs. In SHOW SLAVE STATUS\G, there is a field called Relay_Log_Space. That gives you the sum of all relay log sizes in bytes. Did you know you could put a cap on that number?

The option is called relay_log_space_limit.

For example, if you want to cap the total number of bytes to 10G, do the following:

STEP 01) Add this to /etc/my.cnf on the Slave

[mysqld]
relay_log_space_limit = 10G

STEP 02) Run service mysql restart on the Slave

and that's it !!!

When the oldest relay log has all its entries processed, it is deleted and a new relay log is created. That gets filled until all relay logs add up to 10G. That's the only way to control runaway relay log space issues.
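To see how close a slave is to that cap, the Relay_Log_Space value can be compared with the configured limit. This is a small sketch with hypothetical helper names; it relies on MySQL option suffixes K, M and G being multiples of 1024, and it takes the Relay_Log_Space figure from the sample output above.

```python
# MySQL option-file size suffixes are powers of 1024.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def option_bytes(value: str) -> int:
    """Convert an option value such as '10G' into a number of bytes."""
    value = value.strip().upper()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)


def relay_space_used_fraction(relay_log_space: int, limit: str) -> float:
    """Fraction of relay_log_space_limit currently used by relay logs."""
    return relay_log_space / option_bytes(limit)


# 873772 is the Relay_Log_Space value from the sample SHOW SLAVE STATUS output.
print(option_bytes("10G"), relay_space_used_fraction(873772, "10G"))
```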

UPDATE 2012-09-14 18:10 EDT

SUGGESTION: If you make mysqldump backups of the data on the Slave every midnight, you could set up the following to avoid accumulating 1TB of binary logs:

STEP 01) Add this to /etc/my.cnf on the Master

[mysqld]
expire_logs_days = 14

STEP 02) Run this query on the Master

mysql> PURGE BINARY LOGS BEFORE DATE(NOW()) - INTERVAL 14 DAY;

STEP 03) service mysql restart on the Master

STEP 04) Add a mysqldump backup script to a crontab on the Slave

This will make the Slave more useful and keep excess binary logs under control.
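For anyone scripting STEP 02 from a cron job instead of typing it, the cutoff that DATE(NOW()) - INTERVAL 14 DAY evaluates to can be computed client-side. The sketch below is an assumption-laden illustration: the function name is hypothetical, and the script is assumed to use the same clock and time zone as the Master.

```python
from datetime import datetime, timedelta


def purge_statement(now: datetime, keep_days: int = 14) -> str:
    """Build a PURGE statement that keeps the last `keep_days` days of binlogs."""
    # DATE(NOW()) truncates to midnight before the interval is subtracted.
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    cutoff = midnight - timedelta(days=keep_days)
    return f"PURGE BINARY LOGS BEFORE '{cutoff:%Y-%m-%d %H:%M:%S}';"


print(purge_statement(datetime(2012, 9, 14, 18, 10)))
```

Passing the current time in as an argument keeps the function easy to check against a known date.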