MySQL – How to debug a DB memory leak causing MySQL to go beyond its own limits

MySQL

We are having a problem with one of the database servers of one of our applications, possibly caused by some code that is affecting the way MySQL manages its memory.

Until the second week of April, our DB server had a stable memory consumption of about 5 gigs (with a maximum of 7 gigs). But then it started to grow without limit, even surpassing its theoretical maximum possible allocation.

This is our yearly Munin graph showing the increase over the last two months:


[Munin graph: yearly memory usage (image originally hosted on postimg.org)]

This is another view of the last seven days, after a restart of MySQL:


[Munin graph: memory usage over the last seven days after the restart (image originally hosted on postimg.org)]

This is the report created by mysqltuner.pl:

-------- Performance Metrics -------------------------------------------------

[--] Up for: 4d 1h 56m 28s (152M q [431.585 qps], 383K conn, TX: 593B, RX: 29B)
[--] Reads / Writes: 90% / 10%
[--] Total buffers: 5.3G global + 10.2M per thread (200 max threads)

[OK] Maximum possible memory usage: 7.3G (46% of installed RAM)
[OK] Slow queries: 0% (2K/152M)
[OK] Highest usage of available connections: 13% (26/200)
[OK] Key buffer size / total MyISAM indexes: 16.0M/300.0K
[OK] Key buffer hit rate: 100.0% (61M cached / 9 reads)
[OK] Query cache efficiency: 70.8% (103M cached / 146M selects)
[!!] Query cache prunes per day: 501819
[OK] Sorts requiring temporary tables: 0% (926 temp sorts / 3M sorts)
[!!] Joins performed without indexes: 39128
[OK] Temporary tables created on disk: 16% (821K on disk / 5M total)
[OK] Thread cache hit rate: 99% (26 created / 383K connections)
[!!] Table cache hit rate: 10% (845 open / 7K opened)
[OK] Open file limit used: 3% (148/4K)
[OK] Table locks acquired immediately: 99% (65M immediate / 65M locks)
[!!] InnoDB data size / buffer pool: 5.5G/5.0G

We are in unknown territory here. Any help will be appreciated!

Edit: Adding my.cnf

# The MySQL database server configuration file.

[client]
port            = 3306
socket          = /var/run/mysqld/mysqld.sock

[mysqld_safe]
socket          = /var/run/mysqld/mysqld.sock
nice            = 0

[mysqld]
character_set_server = utf8
collation_server = utf8_general_ci

user            = mysql
socket          = /var/run/mysqld/mysqld.sock
pid-file        = /var/run/mysqld/mysqld.pid
port            = 3306
basedir         = /usr
datadir         = /var/lib/mysql
tmpdir          = /tmp
skip-external-locking
bind-address            = 0.0.0.0

# Fine Tuning
max_connections         = 200
key_buffer              = 16M
max_allowed_packet      = 16M
thread_stack            = 192K
join_buffer_size        = 2M
sort_buffer_size        = 2M
read_buffer_size        = 2M
read_rnd_buffer_size    = 4M
thread_cache_size       = 128
thread_concurrency      = 24
table_cache             = 2K
table_open_cache        = 2K
table_definition_cache  = 4K

# This replaces the startup script and checks MyISAM tables if needed
# the first time they are touched
myisam-recover         = BACKUP

# innodb
innodb_buffer_pool_size         = 5G
innodb_flush_log_at_trx_commit  = 1
innodb_support_xa               = 1
innodb_additional_mem_pool_size = 32M
innodb_log_buffer_size          = 8M
innodb_flush_method             = O_DIRECT

# Query Cache Configuration
query_cache_limit               = 32M
query_cache_size                = 256M
query_cache_min_res_unit        = 256

# Logging and Replication
log_error                       = /var/log/mysql/error.log
log-slow-queries                = /var/log/mysql/slow.log
long_query_time                 = 1

# REPLICATION CONFIGURATION
log_bin                 = /var/log/mysql/mysql-bin.log
log-bin                 = mysql-bin
expire_logs_days        = 15
sync_binlog             = 1
server-id               = 1

ssl-ca   =/etc/ssl/private/repl/cacert.pem
ssl-cert =/etc/ssl/private/repl/master-cert.pem
ssl-key  =/etc/ssl/private/repl/master-key.pem

[mysqldump]
quick
quote-names
max_allowed_packet      = 16M

[isamchk]
key_buffer              = 16M                                

Best Answer

...even surpassing its theoretical maximum possible allocation.

[OK] Maximum possible memory usage: 7.3G (46% of installed RAM)

There is not actually a way to calculate maximum possible memory usage for MySQL, because there is no cap on the memory it can request from the system.

The calculation done by mysqltuner.pl is only an estimate, based on a formula that doesn't take into account all possible variables, because if all possible variables were taken into account, the answer would always be "infinite." It's unfortunate that it's labeled this way.
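
For what it's worth, you can roughly reproduce the 7.3G figure from your own settings. The query below is only a sketch of the kind of arithmetic mysqltuner.pl does (its exact formula differs in the details), using the global and per-thread buffers from the my.cnf you posted; the variable names assume the 5.5/5.6-era server you appear to be running, since the query cache variables no longer exist in later versions:

-- Approximate "maximum possible memory usage": global buffers, plus
-- per-thread buffers multiplied by max_connections.
-- innodb_additional_mem_pool_size (32M in this config) is left out for brevity.
SELECT
  (@@innodb_buffer_pool_size + @@innodb_log_buffer_size
   + @@key_buffer_size + @@query_cache_size) / 1024 / 1024 / 1024
    AS global_buffers_gb,
  (@@join_buffer_size + @@sort_buffer_size + @@read_buffer_size
   + @@read_rnd_buffer_size + @@thread_stack) / 1024 / 1024
    AS per_thread_buffers_mb,
  (@@innodb_buffer_pool_size + @@innodb_log_buffer_size
   + @@key_buffer_size + @@query_cache_size
   + @@max_connections * (@@join_buffer_size + @@sort_buffer_size
     + @@read_buffer_size + @@read_rnd_buffer_size + @@thread_stack))
    / 1024 / 1024 / 1024
    AS estimated_max_gb;

With your values that works out to roughly 5.3G + (200 × 10.2M), i.e. about 7.3G, and that is exactly the point: the figure only adds up configured buffer sizes. Allocations it doesn't cover (in-memory temporary tables, per-query buffers that can be allocated more than once per thread, and so on) are what let the server sail right past it.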

Here is my theory on what's contributing to your excessive memory usage:

thread_cache_size       = 128

Given that your max_connections is set to 200, the value of 128 for thread_cache_size seems far too high. Here's what makes me think this might be contributing to your problem:

When a thread is no longer needed, the memory allocated to it is released and returned to the system unless the thread goes back into the thread cache. In that case, the memory remains allocated.

http://dev.mysql.com/doc/refman/5.6/en/memory-use.html

If your workload causes even an occasional client thread to require a large amount of memory, those threads may be holding onto that memory, then going back to the pool and sitting around, continuing to hold on to memory they don't technically "need" any more, on the premise that holding on to the memory is less costly than releasing it if you're likely to need it again.
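
Before changing anything, it's also worth grabbing a baseline of how busy the thread cache actually is. This is just a convenience query over standard status counters, the same counters mysqltuner summarised in your report:

mysql> SHOW GLOBAL STATUS WHERE Variable_name IN ('Threads_cached', 'Threads_connected', 'Threads_created', 'Max_used_connections');

In your case the report already tells the story: only 26 threads ever created against 383K connections, and never more than 26 in use at once, so a cache sized for 128 threads is sized for a level of concurrency this server never reaches. Whatever threads do land in the cache can sit there indefinitely, keeping whatever they allocated.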

I think it's worth a try to do the following, after first making a note of how much memory MySQL is using at the moment.

Note how many threads are currently cached:

mysql> show status like 'Threads_cached';
+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| Threads_cached | 9     |
+----------------+-------+
1 row in set (0.00 sec)

Next, disable the thread cache.

mysql> SET GLOBAL thread_cache_size = 0;

This disables the thread cache, but the cached threads will stay in the pool until they're used one more time. Disconnect from the server, then reconnect and repeat.

mysql> show status like 'Threads_cached';

Continue disconnecting, reconnecting, and checking until the counter reaches 0.

Then, see how much memory MySQL is holding.

You may see a decrease, possibly significant, and then again you may not. I tested this on one of my systems, which had 9 threads in the cache. Once those threads had all been cleared out of the cache, the total memory held by MySQL did decrease... not by much, but it does illustrate that threads in the cache do release at least some memory when they are destroyed.

If you see a significant decrease, you may have found your problem. If you don't, then there's one more thing that needs to happen, and how quickly it can happen depends on your environment.

If the theory holds that the other threads -- the ones currently servicing active client connections -- have significant memory allocated to them, either because of recent work in their current client session or because of work requiring a lot of memory that was done by another connection prior to them languishing in the pool, then you won't see all of the potential reduction in memory consumption until those threads are allowed to die and be destroyed. Presumably your application doesn't hold them forever, but how long it will take to know for sure whether there's a difference will depend on whether you have the option of cycling your application (dropping and reconnecting the client threads) or if you'll have to just wait for them to be dropped and reconnected over time on their own.
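
If you want a rough idea of how long the existing application connections have been sitting around (so you know whether you'll be waiting minutes or weeks for them to cycle), the processlist is enough. For an idle connection the time column is the number of seconds since it last did anything, which is a reasonable proxy here:

mysql> SELECT id, user, host, command, time FROM information_schema.PROCESSLIST ORDER BY time DESC;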

But... it seems like a worthwhile test. You should not see a substantial performance penalty by setting thread_cache_size to 0. Fortunately, thread_cache_size is a dynamic variable, so you can freely change it with the server running.
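
And when you're done measuring, putting things back is just as easy; SET GLOBAL doesn't touch your my.cnf, so a restart would also bring back the configured value of 128:

mysql> SET GLOBAL thread_cache_size = 128;

If the test does point at the thread cache, a value far smaller than 128 is almost certainly enough here; for comparison, the auto-sized default introduced in MySQL 5.6.8 is 8 + (max_connections / 100), which would come out to 10 for your max_connections = 200.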