MySQL – InnoDB Master-Master replication goes inconsistent after power-failure tests

failover, innodb, MySQL, PHP, replication

Afternoon gents,

I'm currently stress-testing a Master-Master replication setup that uses InnoDB as the storage engine.

We're using the simple script below for testing, which we run from the Linux CLI on a remote server.

<?php

// Simple stress test: connect, insert a random row, disconnect, repeat.
// The loop runs until the script is killed. Note that the legacy
// mysql_* functions report errors via return values, not exceptions,
// so the catch block below is effectively dead code.
while (true) {
    try {
        $conn = mysql_connect('10.0.10.210', 'test', 'test');
        if ($conn) {
            mysql_select_db('testdb');
            $random = rand(0, 1000);
            $res = mysql_query("INSERT INTO test VALUES(0, 'test', $random)");
            if ($res) {
                echo "\n inserted " . microtime();
            } else {
                echo "\n not inserted " . microtime();
            }
            mysql_close($conn);
        } else {
            echo "\n can not connect";
        }
    } catch (Exception $ex) {
        echo "\n can not insert" . microtime();
    }
}

The issue we're facing appears when we shut off one of the hosts using nothing but the power plug, a hard power-off that is.

We're also using MySQL-MMM for failover purposes. That has nothing to do with the issue we're facing, but I'll explain the procedure we're using anyway:

1) Master-Master replication is working perfectly; server1 holds the virtual IP 10.0.10.210 and serves both writes and reads.

2) We shut off server1 by unplugging the power cable; the virtual IP moves to server2, everything keeps working, and inserts continue after ~20 seconds of downtime.

3) We start server1 again; it comes back up, reclaims the virtual IP address, and inserts continue after 1–2 seconds of downtime.

The problem is that we lose all the inserts that happened during server1's downtime, and if I run STOP SLAVE; START SLAVE; I get this error:

[ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position', Error_code: 1236

And if I check the binary log manually at the offset that mysqld.log reports:

[root@db1 mysql]# mysqlbinlog --offset=623435 db1-mysql-bin.000001
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#121030 12:55:16 server id 1  end_log_pos 106   Start: binlog v 4, server v 5.1.61-log created 121030 12:55:16 at startup
# Warning: this binlog is either in use or was not closed properly.
ROLLBACK/*!*/;
BINLOG '
VOqPUA8BAAAAZgAAAGoAAAABAAQANS4xLjYxLWxvZwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAABU6o9QEzgNAAgAEgAEBAQEEgAAUwAEGggAAAAICAgC
'/*!*/;
ERROR: Error in Log_event::read_log_event(): 'read error', data_len: 112, event_type: 2
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
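
The only way I've found to get the slave past this is to re-point it at a valid coordinate on the other server, which of course does nothing to bring back the missing writes. A sketch using standard commands; the file name and position below are placeholders for whatever SHOW MASTER STATUS reports on the current master:

-- Re-point the stuck slave at a known-good binlog coordinate.
-- Placeholders: take the real file/position from SHOW MASTER STATUS
-- on the other server (position 4 is the start of a binlog file).
-- Skipping ahead abandons any events left in the truncated binlog,
-- so the lost inserts stay lost.
STOP SLAVE;
CHANGE MASTER TO
    MASTER_LOG_FILE = 'db1-mysql-bin.000002',
    MASTER_LOG_POS  = 4;
START SLAVE;
SHOW SLAVE STATUS\G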

I understand that the binary log file isn't being closed properly, but isn't InnoDB supposed to take care of this? Surely a hard power-off is not something that's very rare, at least not in my mind. I'm running the EXT4 filesystem.
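
For reference, these are the settings I compare on both servers when checking durability behaviour (standard MySQL variables; how they interact is summarised in the comments as I understand it):

-- Check the durability-related settings on both masters.
SHOW VARIABLES LIKE 'sync_binlog';
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
-- innodb_flush_log_at_trx_commit = 1 makes InnoDB's own redo log
-- durable at every commit, but the binary log that replication reads
-- is flushed separately, which is what sync_binlog controls.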

So far this is nothing but a lab setup; in reality we run this in state-of-the-art (not submerged …) data centers with all the necessary precautions.

Any light shed on the matter would be greatly appreciated, thank you.

my.cnf

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
symbolic-links=0
sync_binlog=1

# REPLICATION SETTINGS
server_id = 2
replicate-same-server-id = 0     # skip events that carry our own server id
auto-increment-increment = 2     # two masters: step auto-increment by 2
auto-increment-offset = 2        # this master takes the even values
replicate-do-db = test           # only apply replicated statements for `test`
binlog-ignore-db = mysql         # don't binlog changes made in the mysql schema

log-bin=db2-mysql-bin
relay-log=db2-relay-log
relay-log-index=relay-log-index

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

UPDATE

I've now switched the filesystem from EXT4 to XFS, and that did indeed take care of the data loss, but now I have another problem, albeit a small one that should be easy to solve.

After I go through the procedure (shut off server1, fail over to server2, start server1, fail back to server1), everything keeps working brilliantly and server1 picks up exactly where server2 left off. The only problem is that server2 stops syncing from server1, so the break is now in the other direction.

If I run STOP SLAVE; START SLAVE; on server2, it starts syncing and within a few seconds is identical to server1 again, but why doesn't it do this automatically?
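
For what it's worth, this is how I check where server2 is stuck before the manual restart; the commands are standard, and which fields matter is my own reading of the situation:

-- On server2, after server1 comes back:
SHOW SLAVE STATUS\G
-- Fields to watch: Slave_IO_Running, Slave_SQL_Running, Last_IO_Error.
-- A slave that merely lost its connection retries on its own, but an
-- I/O thread that stopped with an error stays stopped until kicked:
START SLAVE;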

Best Answer

Replication and binary logging happen independently of InnoDB, which can unfortunately cause problems.

Check out: http://dev.mysql.com/doc/refman/5.5/en/replication-options-binary-log.html#sysvar_sync_binlog

From what you describe, I suspect that sync_binlog is set to 0 on your servers. Leaving it at 0 means MySQL relies on the filesystem to handle flushing to disk. Effectively, this means the binlog data will often sit in the filesystem cache; the kernel flushes that to disk at some interval, but in the case of a power failure anything still in the cache is lost.

Setting sync_binlog to 1 forces MySQL to flush each binlog event to disk using fdatasync after every commit. This is safer (you will lose at most one transaction in a power failure) but creates a lot more disk activity. Benchmark it and see what the impact is for your workload; knowing the trade-offs of both settings will hopefully help you make an informed decision.
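
If you want to test this quickly, both of the relevant knobs are dynamic variables, so you can flip them at runtime before persisting anything in my.cnf (a sketch; the values shown are the safe-but-slower settings):

-- Toggle durability settings at runtime to measure their cost.
SET GLOBAL sync_binlog = 1;                     -- fsync binlog at every commit
SET GLOBAL innodb_flush_log_at_trx_commit = 1;  -- fsync InnoDB redo log at every commit
-- Re-run the insert loop against each combination, compare
-- inserts/second, then persist the chosen values in my.cnf.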

Hope that helps.