I'm running a distributed app in the cloud where clients keep open multiple long-running connections to a MySQL (Percona) server.
After updating to the latest Percona release, the maximum number of simultaneously connected clients has dropped significantly. With the previous Percona version the same setup successfully handled up to 8K connections; now it struggles to get above 3K.
By "struggles" I mean that once the connection bottleneck is reached, connecting to MySQL times out even from the command line. When I do manage to connect, show processlist doesn't show any hanging queries or locks, so the connections that are established work just fine. The Apache webserver serving the PHP webapp used to monitor the project also hangs indefinitely.
Here's the my.cnf I use, patched together from various guides:
[mysqld]
open_files_limit = 16384
table_open_cache = 16384
character_set_server = utf8mb4
max_connections = 16384
expire_logs_days = 10
max_binlog_size = 100M
innodb_open_files = 16384
innodb_file_per_table = 1
innodb_read_io_threads = 4
innodb_write_io_threads = 4
innodb_thread_concurrency = 0
innodb_log_file_size = 128M
innodb_open_files = 4000
innodb_flush_method = O_DIRECT
innodb_buffer_pool_instances = 1
thread_pool_size = 16
local_infile = 1
skip-name-resolve
thread_cache_size = 16384
thread_handling = pool-of-threads
innodb_buffer_pool_size = 512M
innodb_buffer_pool_instances = 1
innodb_log_buffer_size = 64M
And the Percona version I'm running:
mysqld --version
Ver 8.0.15-6 for debian-linux-gnu on x86_64 (Percona Server (GPL), Release '6', Revision '63abd08')
OS: Ubuntu 18.04.2 LTS
Hardware:
- CPU:
Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz
- RAM:
32G
The queries I'm running are all based on unique indexes (no joins), tables are all InnoDB. The same configuration worked fine until the update to Percona 8.
The my.cnf settings I had to remove after the upgrade, because they were no longer supported, are:
innodb_locks_unsafe_for_binlog = 0
query_cache_size = 0
query_cache_type = 0
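Worth noting: both settings are gone in MySQL/Percona 8.0 rather than merely deprecated. The query cache has no 8.0 replacement, and innodb_locks_unsafe_for_binlog = 0 was the default (OFF) anyway, so removing these lines should not change behavior. A sketch of the corresponding 8.0 my.cnf fragment (not from the original post):

```ini
# my.cnf fragment for MySQL/Percona 8.0 (a sketch)
[mysqld]
# query_cache_size / query_cache_type: removed in 8.0 -- delete the lines,
# nothing replaces them.
# innodb_locks_unsafe_for_binlog: removed in 8.0; the relaxed gap locking
# that = 1 used to provide is now obtained via the isolation level instead:
# transaction_isolation = READ-COMMITTED   # only if you actually relied on it
```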
Things I tried:
- I ran mysqltuner.pl but didn't get any relevant advice from it.
- ulimit -n and ulimit -s show values significantly higher than what I'm using (1048576 and 16384 respectively).
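One more check that may be worth doing (an assumption on my part, not something from the post): on Ubuntu 18.04 mysqld is started by systemd, and systemd services do not inherit the shell's ulimit values, so the limits you see interactively may not be the ones the server actually runs with. A sketch of how to verify (the unit name mysql is an assumption; your package may use a different one):

```shell
# Limits of the running mysqld process (not of your interactive shell):
grep 'Max open files' /proc/$(pidof mysqld)/limits

# What the server itself settled on (it may silently lower open_files_limit
# at startup if the OS limit is too low):
mysql -e "SHOW GLOBAL VARIABLES LIKE 'open_files_limit'"

# If the effective limit is too low, raise it with a systemd override:
#   sudo systemctl edit mysql
# and add:
#   [Service]
#   LimitNOFILE=16384
# then: sudo systemctl daemon-reload && sudo systemctl restart mysql
```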
Any other suggestions?
Best Answer
There have been only 92 Connections in the 5 hours of Uptime, and no more than 4 at once (Max_used_connections). So I don't understand your question about struggling with 3K connections.

Observations:
The More Important Issues:

- Extremely high disk activity on the buffer_pool. Since you have 32GB, and assuming it is mostly for MySQL, change innodb_buffer_pool_size to 22G.
- Why so many CHECK, SET, DELETE, SHOW COLUMNS statements?
- You seem to have some really big queries. Turn on the slowlog to catch the worst of them. For example, maybe a DELETE is missing a useful index?
- The table cache is not very effective; increase table_open_cache to 2000.
- Max_connections = 15574 is quite high, especially since the most used is only 4. Drop it to 100.

Details and other observations:
( Innodb_buffer_pool_reads ) = 28,825,842 / 17660 = 1632 /sec -- InnoDB buffer_pool I/O read rate -- check innodb_buffer_pool_size

( Innodb_buffer_pool_pages_flushed ) = 61,160,824 / 17660 = 3463 /sec -- Writes (flushes) -- check innodb_buffer_pool_size

( innodb_buffer_pool_size / _ram ) = 512M / 32768M = 1.6% -- % of RAM used for InnoDB buffer_pool

( Table_open_cache_misses / (Table_open_cache_hits + Table_open_cache_misses) ) = 979 / (8530 + 979) = 10.3% -- Effectiveness of table_open_cache. -- Increase table_open_cache and check table_open_cache_instances.

( innodb_lru_scan_depth ) = 1,024 -- "InnoDB: page_cleaner: 1000ms intended loop took ..." may be fixed by lowering lru_scan_depth

( Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests ) = 28,825,842 / 997216830 = 2.9% -- Read requests that had to hit disk -- Increase innodb_buffer_pool_size if you have enough RAM.

( Innodb_pages_read / Innodb_buffer_pool_read_requests ) = 50,177,127 / 997216830 = 5.0% -- Read requests that had to hit disk -- Increase innodb_buffer_pool_size if you have enough RAM.

( (Innodb_buffer_pool_reads + Innodb_buffer_pool_pages_flushed) ) = ((28825842 + 61160824) ) / 17660 = 5095 /sec -- InnoDB I/O -- Increase innodb_buffer_pool_size?

( innodb_log_buffer_size / innodb_log_file_size ) = 64M / 48M = 133.3% -- Buffer is in RAM; file is on disk. -- The buffer_size should be smaller and/or the file_size should be larger.

( Innodb_os_log_written / (Uptime / 3600) / innodb_log_files_in_group / innodb_log_file_size ) = 248,832 / (17660 / 3600) / 2 / 48M = 0.0005 -- Ratio -- (see minutes)

( Uptime / 60 * innodb_log_file_size / Innodb_os_log_written ) = 17,660 / 60 * 48M / 248832 = 59,535 -- Minutes between InnoDB log rotations. Beginning with 5.6.8, this can be changed dynamically; be sure to also change my.cnf. -- (The recommendation of 60 minutes between rotations is somewhat arbitrary.) Adjust innodb_log_file_size. (Cannot change in AWS.)

( Innodb_rows_deleted / Innodb_rows_inserted ) = 201 / 0 = INF -- Churn -- "Don't queue it, just do it." (If MySQL is being used as a queue.)

( ( Innodb_pages_read + Innodb_pages_written ) / Uptime / innodb_io_capacity ) = ( 50177127 + 224 ) / 17660 / 200 = 1420.6% -- If > 100%, need more io_capacity. -- Increase innodb_io_capacity if the drives can handle it.

( innodb_io_capacity ) = 200 -- I/O ops per second capable on disk. 100 for slow drives; 200 for spinning drives; 1000-2000 for SSDs; multiply by RAID factor.

( expand_fast_index_creation ) = OFF -- ALTER and OPTIMIZE may be greatly sped up by using ON. -- Probably better to be ON.

( innodb_thread_concurrency ) = 0 -- 0 = Let InnoDB decide the best for concurrency_tickets. -- Set to 0 or 64. This may cut back on CPU.

( innodb_print_all_deadlocks ) = OFF -- Whether to log all Deadlocks. -- If you are plagued with Deadlocks, turn this on. Caution: If you have lots of deadlocks, this may write a lot to disk.

( max_connections ) = 15,574 -- Maximum number of connections (threads). Impacts various allocations. -- If max_connections is too high and various memory settings are high, you could run out of RAM.

( (Com_show_create_table + Com_show_fields) / Questions ) = (0 + 436) / 1804 = 24.2% -- Naughty framework -- spending a lot of effort rediscovering the schema. -- Complain to the 3rd party vendor.

( local_infile ) = ON -- local_infile = ON is a potential security issue

( Select_scan / Com_select ) = 217 / 199 = 109.0% -- % of selects doing full table scan. (May be fooled by Stored Routines.) -- Add indexes / optimize queries

( slow_query_log ) = OFF -- Whether to log slow queries.

( long_query_time ) = 10 -- Cutoff (Seconds) for defining a "slow" query. -- Suggest 2

( back_log ) = 15,574 -- (Autosized as of 5.6.6; based on max_connections) -- Raising to min(150, max_connections) may help when doing lots of connections.

( thread_cache_size ) = 16,384 -- How many extra processes to keep around (Not relevant when using thread pooling) (Autosized as of 5.6.8; based on max_connections) -- Greater than 100 may lead to OOM.

( thread_cache_size / Max_used_connections ) = 16,384 / 4 = 409600.0% -- There is no advantage in having the thread cache bigger than your likely number of connections. Wasting space is the disadvantage.
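The concrete recommendations above can be applied at runtime and then persisted in my.cnf; a sketch, assuming MySQL/Percona 8.0 (innodb_buffer_pool_size has been resizable online since 5.7):

```sql
-- Apply the answer's recommendations dynamically (verify RAM headroom first;
-- 22G assumes the 32GB box is dedicated mostly to MySQL):
SET GLOBAL innodb_buffer_pool_size = 22 * 1024 * 1024 * 1024;
SET GLOBAL table_open_cache = 2000;
SET GLOBAL max_connections = 100;  -- raise again only if concurrent use really grows
SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 2;
```

Mirror the same values in the [mysqld] section of my.cnf so they survive a restart.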