MySQL – Error 1236 ‘Found old binary log without GTIDs’ after master restart

MySQL, replication

So after quite a few attempts to build a slave and getting the error above, I decided to start from scratch with no binlogs: on the latest slave rebuild I issued PURGE BINARY LOGS BEFORE NOW() on the master to get rid of any binlogs without GTIDs, then made a full dump with innobackupex and moved it to the slave.
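Roughly, the rebuild looked like this (a sketch only: host name, credentials, paths and the GTID set are placeholders, not the real values):

-- on the master: throw away every binlog written before GTID mode was enabled
PURGE BINARY LOGS BEFORE NOW();

# on the master: take a full physical backup and prepare it
innobackupex /backups/
innobackupex --apply-log /backups/<timestamp>/

# copy the prepared backup into the slave's datadir, fix ownership, start mysqld on the slave

-- on the slave: seed the GTID state recorded by the backup in xtrabackup_binlog_info,
-- then let replication auto-position itself
RESET MASTER;
SET GLOBAL gtid_purged = '<gtid set from xtrabackup_binlog_info>';
CHANGE MASTER TO
  MASTER_HOST = 'master.example.com',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = '...',
  MASTER_AUTO_POSITION = 1;
START SLAVE;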
I started the slave up and it worked fine…
Right up to the point where I always get this exact error: when I restart the master.
So after a mysql restart on the master, the slave of course lost its connection, and after reconnecting I was greeted with the usual:

"Got fatal error 1236 from master when reading data from binary log: 'Found old binary log without GTIDs while looking for the oldest binary log that contains any GTID that is not in the given gtid set'"

This is getting ridiculous: I have rebuilt the slave 4 times in 3 days.

Here's my my.cnf:

[client]
port            = 3306
socket          = /var/run/mysqld/mysqld.sock

[mysqld_safe]
pid-file        = /var/run/mysqld/mysqld.pid
socket          = /var/run/mysqld/mysqld.sock
nice            = 0
open_files_limit        = 65535

[mysqld]
user            = mysql
pid-file        = /var/run/mysqld/mysqld.pid
socket          = /var/run/mysqld/mysqld.sock
port            = 3306
basedir         = /usr
datadir         = /srv/mysql
tmpdir          = /tmp
lc-messages-dir = /usr/share/mysql
explicit_defaults_for_timestamp

bind-address    = xxxxxxxxx

key_buffer              = 64M
max_allowed_packet      = 16M
thread_stack            = 192K
thread_cache_size       = 16
myisam-recover         = BACKUP
table_open_cache                = 512
open_files_limit        = 65535
interactive_timeout=180
wait_timeout=180

query_cache_limit       = 1M
query_cache_size        = 256M

slow_query_log = 1
slow_query_log_file = /var/log/mysql/mysql-slow.log
long_query_time = 10
max_binlog_size         = 100M
expire_logs_days        = 10

log-error       = /var/log/mysql/error.log
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES

symbolic-links=0

innodb_file_per_table
innodb_additional_mem_pool_size = 20M
innodb_buffer_pool_size         = 24G
innodb_file_format              = "Barracuda"

binlog-format=MIXED
log-slave-updates=true
log-bin
gtid-mode=on
server-id=1
enforce-gtid-consistency=true

[mysqldump]
quick
quote-names
max_allowed_packet      = 32M

[mysql]

[isamchk]
key_buffer              = 16M

At the time of the master restart, the slave was already about 4000 seconds behind (that's what I was trying to solve when I got this error, again).

In case you are wondering, the last time I rebuilt the whole thing was last night, so it is not a case of binlogs going missing due to expiry.

Does anyone have any idea?

Thanks

Best Answer

Something is not right about your process.

Usually, when I see error 1236, it looks like the example below, which I used in my old post How can you monitor if MySQL binlog files get corrupted?

[ERROR] Error reading packet from server: Client requested master to start replication from impossible position ( server_errno=1236).
[ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position', Error_code: 1236
111014 20:25:48 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.001067', position 183468345.

Here was the situation: when doing MySQL Replication without GTID, the IO Thread requests a position in the latest Master binlog. If Read_Master_Log_Pos is greater than the actual file size of that binlog, you get error 1236.
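You can see that mismatch for yourself with something like this:

-- on the Slave: which Master binlog and position the IO Thread is reading from
SHOW SLAVE STATUS\G    -- note Master_Log_File and Read_Master_Log_Pos

-- on the Master: the real size of each binlog
SHOW BINARY LOGS;      -- a File_size smaller than Read_Master_Log_Pos means error 1236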

When doing MySQL Replication with GTID, the situation is somewhat similar: the IO Thread is looking for closure with regard to the last GTID it was using. When you restarted MySQL on the Master, you closed the last binlog on the Master and opened a new one on startup, while the IO Thread on the Slave was still active. That is why the same error number comes up.
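If you want to see where the two sides disagree, compare the GTID sets directly (a quick sketch; your actual sets will differ):

-- on the Master: everything it has executed, and what is no longer in its binlogs
SELECT @@GLOBAL.gtid_executed;
SELECT @@GLOBAL.gtid_purged;

-- on the Slave: what the IO Thread has fetched vs. what the SQL Thread has applied
SHOW SLAVE STATUS\G    -- compare Retrieved_Gtid_Set and Executed_Gtid_Set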

The next time you restart a Master, remember the Slaves are active.

The slave should have reconnected after a minute, but that is not happening for you.

To play it safe, you should do the following (spelled out as commands right after the list):

  • On the Slave, STOP SLAVE;
  • On the Master, service mysql restart
  • On the Slave, START SLAVE;
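In terms of actual commands, that is simply:

-- on the Slave
STOP SLAVE;

# on the Master
service mysql restart

-- back on the Slave
START SLAVE;
SHOW SLAVE STATUS\G    -- check that Slave_IO_Running and Slave_SQL_Running are both Yes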

You should not have to do this. As an alternative, try setting up replication with heartbeat set at one tenth of a second:

CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD = 0.1;

This should make the IO Thread on the Slave a little more sensitive.
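Afterwards you can confirm on the Slave that the heartbeat is in effect (5.6 exposes these as status variables):

-- on the Slave, after START SLAVE
SHOW GLOBAL STATUS LIKE 'Slave_heartbeat_period';    -- should reflect the 0.1 you set
SHOW GLOBAL STATUS LIKE 'Slave_received_heartbeats'; -- should keep climbing while the Master is idle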