MySQL will not release global buffers after they are used (innodb_buffer_pool_size, key_buffer_size, query_cache_size, etc.). It will release memory taken by per-thread buffers (join_buffer_size, sort_buffer_size, etc.), but only after the thread is removed and not while it sits in the thread_cache.
The innodb_buffer_pool_size can be tuned dynamically in 5.7, but it is a very invasive operation:
When initiating a resizing operation, the operation does not start until all active transactions are completed. Once the resizing operation is in progress, new transactions and operations that require access to the buffer pool must wait until the resizing operation finishes. The exception to this rule is that concurrent access to the buffer pool is permitted while the buffer pool is defragmented and pages are withdrawn during an operation to decrease buffer pool size. A drawback of allowing concurrent access is that it could result in a temporary shortage of available pages while pages are being withdrawn.
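For example, the resize can be initiated and monitored from any client session. A minimal sketch (the 8GB value is only a placeholder; MySQL rounds the size up to a multiple of innodb_buffer_pool_chunk_size * innodb_buffer_pool_instances if needed):

-- Initiate an online buffer pool resize (5.7+):
SET GLOBAL innodb_buffer_pool_size = 8 * 1024 * 1024 * 1024;
-- Watch the progress of the resize operation:
SHOW STATUS LIKE 'InnoDB_buffer_pool_resize_status';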
Is the additional buffer pool worth it? This query will give you an idea of the current maximum memory that will be used by the InnoDB buffer pool:
SELECT ENGINE,
       SUM(DATA_LENGTH)/1024/1024/1024 AS DATA_LENGTH_IN_GB,
       SUM(INDEX_LENGTH)/1024/1024/1024 AS INDEX_LENGTH_IN_GB,
       SUM(DATA_LENGTH+INDEX_LENGTH)/1024/1024/1024 AS TOTAL_IN_GB
FROM information_schema.TABLES
WHERE ENGINE = 'InnoDB'
  AND TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema')
GROUP BY ENGINE
ORDER BY TOTAL_IN_GB DESC;
However, that is the total InnoDB size of data + indexes. The working set is what matters during peak times, and it consists of your most active tables and rows. If your working set already fits in memory, allocating more is not going to buy you much. If it doesn't fit, you're going to want to increase the buffer pool regardless of peak/non-peak traffic.
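One rough way to gauge whether the working set fits is the buffer pool miss rate, measured over a long uptime rather than right after a restart (a sketch using standard status counters):

-- Innodb_buffer_pool_reads = logical reads that had to go to disk;
-- Innodb_buffer_pool_read_requests = all logical read requests.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
-- miss rate = Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests
-- A miss rate well under 1% suggests the working set fits in memory.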
I would recommend reading the internals section of the MySQL documentation on the InnoDB buffer pool; it is a great primer.
As for the rest of your configuration, the only thing I will say is that your tmp_table_size=333M is not doing what you think it is, and is likely too high.
First:
The maximum size for in-memory temporary tables is the minimum of the tmp_table_size and max_heap_table_size values.
This means that, with max_heap_table_size defaulting to 16MB, in-memory temporary tables are going to be converted to on-disk temporary tables long before reaching 333MB.
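A quick check of the effective in-memory limit on your own server (a minimal sketch):

SELECT LEAST(@@tmp_table_size, @@max_heap_table_size) AS effective_in_memory_limit;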
Second, you might have increased tmp_table_size to avoid on-disk temporary table usage. But there are scenarios where MySQL can never use in-memory temporary tables:
Some conditions prevent the use of an in-memory temporary table, in which case the server uses an on-disk table instead:
Presence of a BLOB or TEXT column in the table
Presence of any string column in a GROUP BY or DISTINCT clause larger than 512 bytes for binary strings or 512 characters for nonbinary strings. (Before MySQL 5.7.3, the limit is 512 bytes regardless of string type.)
Presence of any string column with a maximum length larger than 512 (bytes for binary strings, characters for nonbinary strings) in the SELECT list, if UNION or UNION ALL is used
The SHOW COLUMNS and the DESCRIBE statements use BLOB as the type for some columns, thus the temporary table used for the results is an on-disk table.
I would recommend setting tmp_table_size and max_heap_table_size to 32M or 64M (max) and then examining queries for the conditions mentioned above. If you can't avoid on-disk temporary tables, hopefully your cloud provider puts tmpdir on SSD.
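A sketch of that change (the 64M figure follows the recommendation above; persist the same values in my.cnf so they survive a restart):

SET GLOBAL tmp_table_size = 64 * 1024 * 1024;
SET GLOBAL max_heap_table_size = 64 * 1024 * 1024;
-- Note: changes to the GLOBAL values only affect connections opened
-- after the change; existing sessions keep their old values.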
600 writes/sec with a flush per commit is probably hitting the limit of your current spinning disks. Switching to SSDs would relieve the pressure.
The quick fix (before getting SSDs) is probably to change to this setting:
innodb_flush_log_at_trx_commit = 2
But read the caveats on making that change.
Having that setting and SSDs would let you grow further.
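The setting is dynamic, so it can be tried without a restart. A sketch, with the usual caveat spelled out (with a value of 2, the log is written to the OS at each commit but fsynced only about once per second, so up to roughly one second of committed transactions can be lost on an OS crash or power failure):

SET GLOBAL innodb_flush_log_at_trx_commit = 2;
-- Persist the setting in my.cnf as well if you decide to keep it.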
Another possible fix is to combine some writes into a single COMMIT (where the logic is not violated).
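For example, wrapping a group of related writes in one transaction turns several log flushes into one. A sketch with hypothetical tables:

START TRANSACTION;
INSERT INTO orders (customer_id, total) VALUES (42, 19.99);              -- hypothetical table
INSERT INTO order_items (order_id, sku) VALUES (LAST_INSERT_ID(), 'A1'); -- hypothetical table
COMMIT;  -- one log flush for the whole group instead of one per statement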
Almost always, high CPU and/or I/O is due to poor indexes and/or poorly formulated queries. Turn on the slowlog with long_query_time=1, wait a while, then see what turns up. With queries in hand, provide the SELECT, EXPLAIN SELECT ..., and SHOW CREATE TABLE. Ditto for the write queries. From those, we can probably tame the CPU and/or I/O. Even with your current long_query_time setting of 3, pt-query-digest might find some interesting things.
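Enabling the slowlog dynamically looks like this (a sketch; the log file path is only an example and varies by installation):

SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 1;  -- log statements slower than 1 second
-- After collecting for a while, summarize the log with:
--   pt-query-digest /var/lib/mysql/hostname-slow.log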
Note that with 50 "running" threads, there is a lot of contention; this may be causing the switching, etc, that you noted. We need to get queries to finish faster. With 5.7, the system may keel over with 100 running threads. Increasing beyond about 64, the context switches, mutexes, locks, etc, conspire to slow down every thread, leading to no improvement in throughput while latency goes through the roof.
For a different approach to analyzing the problem, please provide SHOW VARIABLES and SHOW GLOBAL STATUS output. More discussion here.
Analysis of VARIABLES & STATUS
(Sorry, nothing jumps out as addressing your Question.)
Observations:
- Version: 5.7.20-log
- 700 GB of RAM
- Uptime = 36d 13:21:34
- You are not running on Windows.
- Running 64-bit version
- You appear to be running entirely (or mostly) InnoDB.
The More Important Issues:
A lot of temp tables, especially disk-based, are created for complex queries. Let's hope that the slowlog will identify some queries that can be improved (via indexing / reformulating / etc.).
Other indicators are joins without indexes and sort_merge_passes; however, neither of these is conclusive, so we need to see the queries.
Max_used_connections = 701 is >= Max_connections = 700, so some connections were probably refused (see the check below). Also, if that indicated more than, say, 64 threads running, then system performance probably suffered at that time.
Consider throttling the number of connections by throttling the clients. Are you using Apache, Tomcat, or something else?
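You can confirm whether connections were actually refused; since 5.6.5 there is a dedicated counter:

-- Attempts refused because max_connections was already reached:
SHOW GLOBAL STATUS LIKE 'Connection_errors_max_connections';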
70 Threads_running indicates that, at the moment of this SHOW, the system was in trouble.
Increasing the number of statements in each COMMIT (when reasonable) may help performance some.
innodb_log_file_size, at 15GB, is bigger than necessary, but I see no need to change it.
Thousands of tables is usually not a good design.
eq_range_index_dive_limit = 200 concerns me, but I don't know how to advise. Was it a deliberate choice?
Why so many CREATE+DROP PROCEDURE?
Why so many SHOW commands?
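A quick way to see which of these commands dominate (a minimal check):

SHOW GLOBAL STATUS LIKE 'Com_show%';
SHOW GLOBAL STATUS LIKE 'Com%procedure';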
Details and other observations:
( Innodb_buffer_pool_pages_flushed ) = 523,716,598 / 3158494 = 165 /sec
-- Writes (flushes)
-- check innodb_buffer_pool_size
( table_open_cache ) = 10,000
-- Number of table descriptors to cache
-- Several hundred is usually good.
( (Innodb_buffer_pool_reads + Innodb_buffer_pool_pages_flushed) ) = ((61,040,718 + 523,716,598) ) / 3158494 = 185 /sec
-- InnoDB I/O
( Innodb_dblwr_pages_written/Innodb_pages_written ) = 459,782,684/523,717,889 = 87.8%
-- Seems like these values should be equal?
( Innodb_os_log_written ) = 1,071,443,919,360 / 3158494 = 339226 /sec
-- This is an indicator of how busy InnoDB is.
-- Very busy InnoDB.
( Innodb_log_writes ) = 110,905,716 / 3158494 = 35 /sec
( Innodb_os_log_written / (Uptime / 3600) / innodb_log_files_in_group / innodb_log_file_size ) = 1,071,443,919,360 / (3158494 / 3600) / 2 / 15360M = 0.0379
-- Ratio
-- (see minutes)
( Uptime / 60 * innodb_log_file_size / Innodb_os_log_written ) = 3,158,494 / 60 * 15360M / 1071443919360 = 791
-- Minutes between InnoDB log rotations. Beginning with 5.6.8, this can be changed dynamically; be sure to also change my.cnf.
-- (The recommendation of 60 minutes between rotations is somewhat arbitrary.) Adjust innodb_log_file_size. (Cannot change in AWS.)
( Com_rollback ) = 770,457 / 3158494 = 0.24 /sec
-- ROLLBACKs in InnoDB.
-- An excessive frequency of rollbacks may indicate inefficient app logic.
( Innodb_row_lock_waits ) = 632,292 / 3158494 = 0.2 /sec
-- How often there is a delay in getting a row lock.
-- May be caused by complex queries that could be optimized.
( Innodb_dblwr_writes ) = 97,725,876 / 3158494 = 31 /sec
-- "Doublewrite buffer" writes to disk. "Doublewrites" are a reliability feature. Some newer versions / configurations don't need them.
-- (Symptom of other issues)
( Innodb_row_lock_current_waits ) = 13
-- The number of row locks currently being waited for by operations on InnoDB tables. Zero is pretty normal.
-- Something big is going on?
( innodb_print_all_deadlocks ) = OFF
-- Whether to log all Deadlocks.
-- If you are plagued with Deadlocks, turn this on. Caution: If you have lots of deadlocks, this may write a lot to disk.
( local_infile ) = ON
-- local_infile = ON is a potential security issue
( bulk_insert_buffer_size / _ram ) = 8M / 716800M = 0.00%
-- Buffer for multi-row INSERTs and LOAD DATA
-- Too big could threaten RAM size. Too small could hinder such operations.
( Questions ) = 9,658,430,713 / 3158494 = 3057 /sec
-- Queries (outside SP) -- "qps"
-- >2000 may be stressing server
( Queries ) = 9,678,805,194 / 3158494 = 3064 /sec
-- Queries (including inside SP)
-- >3000 may be stressing server
( Created_tmp_tables ) = 1,107,271,497 / 3158494 = 350 /sec
-- Frequency of creating "temp" tables as part of complex SELECTs.
( Created_tmp_disk_tables ) = 297,023,373 / 3158494 = 94 /sec
-- Frequency of creating disk "temp" tables as part of complex SELECTs
-- increase tmp_table_size and max_heap_table_size.
Check the rules for temp tables on when MEMORY is used instead of MyISAM. Perhaps minor schema or query changes can avoid MyISAM.
Better indexes and reformulation of queries are more likely to help.
( (Com_insert + Com_update + Com_delete + Com_replace) / Com_commit ) = (693300264 + 214511608 + 37537668 + 0) / 1672382928 = 0.565
-- Statements per Commit (assuming all InnoDB)
-- Low: Might help to group queries together in transactions; High: long transactions strain various things.
( Select_full_join ) = 338,957,314 / 3158494 = 107 /sec
-- joins without index
-- Add suitable index(es) to tables used in JOINs.
( Select_full_join / Com_select ) = 338,957,314 / 6763083714 = 5.0%
-- % of selects that are indexless join
-- Add suitable index(es) to tables used in JOINs.
( Select_scan ) = 124,606,973 / 3158494 = 39 /sec
-- full table scans
-- Add indexes / optimize queries (unless they are tiny tables)
( Sort_merge_passes ) = 1,136,548 / 3158494 = 0.36 /sec
-- Hefty sorts
-- Increase sort_buffer_size and/or optimize complex queries.
( Com_insert + Com_delete + Com_delete_multi + Com_replace + Com_update + Com_update_multi ) = (693300264 + 37537668 + 198418338 + 0 + 214511608 + 79274476) / 3158494 = 387 /sec
-- writes/sec
-- 50 writes/sec + log flushes will probably max out I/O write capacity of normal drives
( ( Com_stmt_prepare - Com_stmt_close ) / ( Com_stmt_prepare + Com_stmt_close ) ) = ( 39 - 38 ) / ( 39 + 38 ) = 1.3%
-- Are you closing your prepared statements?
-- Add Closes.
( Com_stmt_close / Com_stmt_prepare ) = 38 / 39 = 97.4%
-- Prepared statements should be Closed.
-- Check whether all Prepared statements are "Closed".
( innodb_autoinc_lock_mode ) = 1
-- Galera: desires 2 -- 2 = "interleaved"; 1 = "consecutive" is typical; 0 = "traditional".
( Max_used_connections / max_connections ) = 701 / 700 = 100.1%
-- Peak % of connections
-- increase max_connections and/or decrease wait_timeout
( Threads_running - 1 ) = 71 - 1 = 70
-- Active threads (concurrency when data collected)
-- Optimize queries and/or schema
Abnormally large: (Most of these stem from being a very busy system.)
Com_commit = 529 /sec
Com_create_procedure = 0.01 /HR
Com_drop_procedure = 0.01 /HR
Com_delete = 12 /sec
Com_delete_multi = 63 /sec
Com_insert = 219 /sec
Com_kill = 0.69 /HR
Com_reset = 0.0011 /HR
Com_revoke = 0.0023 /HR
Com_select = 2141 /sec
Com_show_binlogs = 12 /HR
Com_show_create_func = 0.011 /HR
Com_show_privileges = 0.0034 /HR
Com_show_profile = 0.027 /HR
Com_show_profiles = 0.028 /HR
Com_show_slave_status = 0.037 /sec
Com_show_storage_engines = 12 /HR
Com_show_warnings = 0.14 /sec
Com_slave_stop = 0.0011 /HR
Com_update_multi = 25 /sec
Created_tmp_files = 0.3 /sec
Handler_commit = 3251 /sec
Handler_external_lock = 18787 /sec
Handler_prepare = 615 /sec
Handler_read_first = 239 /sec
Handler_read_key = 173669 /sec
Handler_read_next = 1291439 /sec
Handler_read_prev = 28535 /sec
Handler_read_rnd = 32789 /sec
Innodb_buffer_pool_bytes_dirty = 7.03e+10
Innodb_buffer_pool_pages_data = 3.41e+7
Innodb_buffer_pool_pages_dirty = 4.29e+6
Innodb_buffer_pool_pages_misc = 2.15e+6
Innodb_buffer_pool_pages_total = 3.62e+7
Innodb_data_fsyncs = 132 /sec
Innodb_data_writes = 232 /sec
Innodb_data_written = 5440151 /sec
Innodb_dblwr_pages_written = 145 /sec
Innodb_os_log_written / (Uptime / 3600) / innodb_log_files_in_group = 582.3MB
Innodb_pages_written = 165 /sec
Innodb_row_lock_time = 5.97e+7
Innodb_rows_deleted + Innodb_rows_inserted = 2180 /sec
Innodb_rows_inserted = 2155 /sec
Innodb_rows_read = 1398531 /sec
Max_used_connections = 701
Open_tables = 10,000
Select_full_range_join = 2.57e+7
Select_range = 130 /sec
Sort_range = 30 /sec
Sort_scan = 332 /sec
Table_open_cache_hits = 9354 /sec
Threads_running = 71
eq_range_index_dive_limit = 200
innodb_purge_threads = 4
innodb_thread_sleep_delay = 16,925
Best Answer
See this bug report, where others appear to have the same issue:
https://bugs.mysql.com/bug.php?id=83047
Note that the reports here mention Linux, but I use FreeBSD, so it does not appear to be an OS-specific issue.
The suggested workaround is to try a third-party malloc implementation, such as jemalloc or tcmalloc.
FreeBSD already uses jemalloc as its default allocator (although I was unable to confirm 100% that the mysqld executable I have was actually linked against it), so I tried installing and configuring tcmalloc. (See https://github.com/gperftools/gperftools for source; your OS may have it available as a package.)
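For reference, on systems that start the server via mysqld_safe, one way to wire in tcmalloc without rebuilding is its malloc-lib option; a sketch (the library path is distribution-specific and only an example; FreeBSD rc scripts need a different mechanism, such as setting LD_PRELOAD):

[mysqld_safe]
malloc-lib = /usr/lib/libtcmalloc_minimal.so.4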
An hour after restart, allocated memory for the mysqld process was at 5954M.
Now, 31 hours after restart, memory is at 6037M, which is an increase of just over 1%. In addition, that value has not changed for the past 15 hours, which suggests that the mysqld process is in a state where all future allocations can be satisfied internally.
It's alarming that somewhere between 5.6 and 5.7, something in MySQL changed that causes the default malloc implementation to go off the charts, and that a third-party library is required to fix it. I cannot find any official acknowledgement of this bug.
Hope this answer helps anyone else experiencing this very frustrating problem.