MySQL Performance – Restarting MySQL Makes Application Fast

MySQL

I have a Drupal 7 application running on a large MySQL database. It's hosted on an AWS instance (30 GB RAM, 8 CPUs). Below is a copy of my my.cnf.

The issue is that my application gets noticeably faster when I restart MySQL, which makes me think the current configuration isn't optimal and that once the caches fill up, they slow the server down.

Could any experts point out the issue here?

[mysqld]
# moving datadir to /media/db ebs storage
datadir=/media/db/mysql
socket=/var/lib/mysql/mysql.sock
tmpdir=/dev/shm/
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0


local-infile=0
log-warnings=2

skip-external-locking
bind-address        = 127.0.0.1
expire_logs_days    = 1
max_binlog_size         = 100M
log-bin=bin.log
log-bin-index=bin-log.index
binlog_format=row
key_buffer_size                = 512M
max_allowed_packet  = 32M
thread_stack        = 292K
myisam_sort_buffer_size = 64M
thread_cache_size = 50
myisam-recover         = BACKUP
max_connections        = 100
max_user_connections = 150
table_cache            = 16384
table_open_cache       = 10240
table_definition_cache = 4096
thread_concurrency     = 8
join_buffer_size = 4M
query_cache_type = 1
query_cache_limit = 256K
query_cache_min_res_unit = 2k
query_cache_size = 200M
innodb_buffer_pool_size= 8G
tmp_table_size=64M
max_heap_table_size=64M
wait_timeout=50
interactive_timeout=50
connect_timeout=10
open-files-limit               = 65535
innodb_flush_log_at_trx_commit = 0
innodb_large_prefix=true
innodb_file_format=barracuda
innodb_file_per_table=true
table_open_cache = 5000
innodb_log_buffer_size= 32M
innodb_log_file_size = 512m
read_rnd_buffer_size = 8M
transaction-isolation = READ-COMMITTED
innodb_lock_wait_timeout = 25
innodb_write_io_threads = 16
innodb_sync_spin_loops = 200
innodb_spin_wait_delay = 24
innodb_read_io_threads = 16
innodb_flush_method = O_DIRECT
innodb_buffer_pool_instances = 8
innodb_autoinc_lock_mode = 2

[mysqldump]
max_allowed_packet  = 32M

[isamchk]
key_buffer  = 32M

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
set-variable=long_query_time=1
long_query_time=10
slow_query_log_file =/var/log/mysql/mysql-slow.log

Best Answer

This may be the problem:

query_cache_size = 200M

Don't set it to more than 50M. When a write to a table occurs, that cache must be scanned to purge any entries relating to that table. (Most production systems actually benefit from turning the QC off.)
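If you want to test the effect before editing my.cnf and restarting, the query cache can be resized (or disabled) at runtime. A sketch; remember to persist whichever setting you keep in my.cnf as well:

```sql
-- Shrink the query cache at runtime to the recommended ceiling
SET GLOBAL query_cache_size = 50 * 1024 * 1024;  -- 50M

-- Or turn it off entirely (often the better choice on write-heavy workloads)
SET GLOBAL query_cache_size = 0;
SET GLOBAL query_cache_type = OFF;
```

Resizing flushes the cache, so expect a brief dip while it refills (or, if disabled, while InnoDB's own caches absorb the load).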

Are you using MyISAM? Or InnoDB? The following values assume a mix of the two, and are sub-optimal for the typical case (all InnoDB):

key_buffer_size                = 512M
innodb_buffer_pool_size= 8G
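To confirm which engines are actually in use, sum the data and index sizes per engine from information_schema; a quick check (assumes you have read access to information_schema):

```sql
SELECT engine,
       COUNT(*)                                         AS tables,
       ROUND(SUM(data_length + index_length)/1024/1024) AS total_mb
FROM   information_schema.tables
WHERE  table_schema NOT IN ('mysql','information_schema','performance_schema')
GROUP  BY engine;
```

If the MyISAM row is tiny or absent, key_buffer_size can be cut to ~50M and the RAM given to the buffer pool instead.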

5.5.58? That's 3 major revisions behind.

You can only rarely "tune your way out of a performance problem". Hopefully, the slowlog will point out that it is simply a matter of adding a composite index or reformulating a query.

Analysis of STATUS and VARIABLES

Observations:

  • Version: 5.5.58-log
  • 30 GB of RAM
  • Uptime = 18:00:31; some GLOBAL STATUS values may not be meaningful yet.
  • You are not running on Windows.
  • Running 64-bit version
  • You appear to be running entirely (or mostly) InnoDB.

The More Important Issues:

Your dataset seems to be rather small, so the small setting for innodb_buffer_pool_size is actually OK, at least until the dataset grows significantly.

Shrink query_cache_size to 50M to improve performance.

The frequency of full table scans and range scans indicates the need to look into the slowlog. Turn on the slow query log and set long_query_time=1.
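Note that in your my.cnf the slow-log settings sit under [mysqld_safe], which is likely why the analysis shows slow_query_log = OFF. They belong under [mysqld]; a minimal fragment:

```ini
[mysqld]
slow_query_log      = 1
slow_query_log_file = /var/log/mysql/mysql-slow.log
long_query_time     = 1
```

(The old `set-variable=` syntax is also obsolete and should be dropped.) After collecting a day or so of data, summarize the log with pt-query-digest or mysqldumpslow to find the worst offenders.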

Are the SAVEPOINTs working well for you?

Details and other observations:

( innodb_buffer_pool_size / _ram ) = 8192M / 30720M = 26.7% -- % of RAM used for InnoDB buffer_pool

( (key_buffer_size / 0.20 + innodb_buffer_pool_size / 0.70) / _ram ) = (512M / 0.20 + 8192M / 0.70) / 30720M = 46.4% -- Most of available ram should be made available for caching. -- http://mysql.rjweb.org/doc.php/memory

( table_open_cache ) = 5,000 -- Number of table descriptors to cache -- Several hundred is usually good.

( innodb_buffer_pool_size ) = 8192M -- InnoDB Data + Index cache

( Innodb_buffer_pool_pages_free * 16384 / innodb_buffer_pool_size ) = 396,447 * 16384 / 8192M = 75.6% -- buffer pool free -- buffer_pool_size is bigger than working set; could decrease it

( Innodb_buffer_pool_pages_free / Innodb_buffer_pool_pages_total ) = 396,447 / 524280 = 75.6% -- Pct of buffer_pool currently not in use -- innodb_buffer_pool_size is bigger than necessary?

( Uptime / 60 * innodb_log_file_size / Innodb_os_log_written ) = 64,831 / 60 * 512M / 258605568 = 2,243 -- Minutes between InnoDB log rotations. Beginning with 5.6.8, this can be changed dynamically; be sure to also change my.cnf. -- (The recommendation of 60 minutes between rotations is somewhat arbitrary.) Adjust innodb_log_file_size. (Cannot change in AWS RDS.)
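On a self-managed EC2 instance (as opposed to RDS) the redo log can be resized on 5.5, but it needs a clean shutdown first. A rough sketch; paths match the datadir in your my.cnf, but verify them before running anything:

```shell
# 5.5 procedure -- a sketch, not a drop-in script
mysql -e "SET GLOBAL innodb_fast_shutdown = 0"   # force a full flush on shutdown
service mysqld stop
mv /media/db/mysql/ib_logfile0 /media/db/mysql/ib_logfile1 /tmp/  # keep until verified
# edit my.cnf: innodb_log_file_size = <new size>
service mysqld start   # InnoDB recreates the log files at the new size
```

If the server refuses to start, restore the old ib_logfile* and revert my.cnf.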

( Innodb_rows_deleted / Innodb_rows_inserted ) = 110,734 / 105511 = 1.05 -- Churn -- "Don't queue it, just do it." (If MySQL is being used as a queue.)

( innodb_io_capacity ) = 200 -- I/O ops per second the disk is capable of. 100 for slow drives; 200 for spinning drives; 1000-2000 for SSDs; multiply by RAID factor.

( innodb_stats_on_metadata ) = innodb_stats_on_metadata = ON -- Re-analyze table when touching stats. -- ON is likely to slow down certain SHOWs and information_schema accesses.

( sync_binlog ) = 0 -- Use 1 for added security, at some cost of I/O. =1 may lead to lots of "query end"; =0 may lead to "binlog at impossible position" and lose transactions in a crash, but is faster.

( innodb_print_all_deadlocks ) = innodb_print_all_deadlocks = OFF -- Whether to log all Deadlocks. -- If you are plagued with Deadlocks, turn this on. Caution: If you have lots of deadlocks, this may write a lot to disk.

( net_buffer_length / max_allowed_packet ) = 16,384 / 32M = 0.05%

( query_cache_size ) = 200M -- Size of QC -- Too large = too much overhead. Recommend either 0 or no more than 50M.

( Qcache_lowmem_prunes/Qcache_inserts ) = 942,628/2529864 = 37.3% -- Removal Ratio (pct of time need to prune)

Qcache_lowmem_prunes = 15/second

( (query_cache_size - Qcache_free_memory) / Qcache_queries_in_cache / query_alloc_block_size ) = (200M - 149143456) / 29625 / 8192 = 0.25 -- query_alloc_block_size vs formula -- Adjust query_alloc_block_size

(Some of the other STATUS values indicate that the QC is effective.)

( Select_scan ) = 163,611 / 64831 = 2.5 /sec -- full table scans -- Add indexes / optimize queries (unless they are tiny tables)

( Select_scan / Com_select ) = 163,611 / 2592449 = 6.3% -- % of selects doing full table scan. (May be fooled by Stored Routines.) -- Add indexes / optimize queries
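The usual fix for a high full-table-scan rate is a composite index matching the query's filters. A hypothetical example only (the query and index name are assumptions, not taken from your schema, though Drupal 7's node table does have these columns):

```sql
-- Suppose the slowlog shows:
--   SELECT nid, title FROM node WHERE type = 'article' AND status = 1 ORDER BY created;
-- A composite index covering both filters and the sort avoids the full scan:
ALTER TABLE node ADD INDEX ix_type_status_created (type, status, created);

-- Verify: EXPLAIN's "type" column should change from ALL to ref/range.
EXPLAIN SELECT nid, title FROM node
WHERE type = 'article' AND status = 1 ORDER BY created;
```

Add indexes only for queries the slowlog actually surfaces; each index adds write overhead.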

( expire_logs_days ) = 1 -- How soon to automatically purge binlog (after this many days) -- Too large (or zero) = consumes disk space; too small = need to respond quickly to network/machine crash. (Not relevant if log_bin = OFF)

( log_slow_queries ) = log_slow_queries = OFF -- Whether to log slow queries. (Before 5.1.29, 5.6.1)

( slow_query_log ) = slow_query_log = OFF -- Whether to log slow queries. (5.1.12)

( long_query_time ) = 10 -- Cutoff (Seconds) for defining a "slow" query. -- Suggest 1

( Connections ) = 97,624 / 64831 = 1.5 /sec -- Connections -- Increase wait_timeout; use pooling?

( thread_cache_size ) = 50 -- How many extra threads to keep around. (Not relevant when using thread pooling.) (Autosized as of 5.6.8, based on max_connections.) 0 is inefficient for non-Windows; 10 is probably fine; more than 100 may lead to OOM.

( thread_cache_size / Max_used_connections ) = 50 / 25 = 200.0% -- There is no advantage in having the thread cache bigger than your likely number of connections. Wasting space is the disadvantage.

Abnormally small:

Qcache_total_blocks * query_cache_min_res_unit / Qcache_queries_in_cache = 5,435
interactive_timeout = 50
query_cache_limit = 262,144
query_cache_min_res_unit = 2,048
wait_timeout = 50

Abnormally large:

Com_release_savepoint = 1.6 /HR
Com_savepoint = 1.6 /HR
Handler_savepoint = 3.2 /HR
Key_write_requests = 407 /sec
Qcache_free_blocks = 18,324
Select_range / Com_select = 34.2%
innodb_spin_wait_delay = 24
innodb_sync_spin_loops = 200
max_user_connections = 150
table_definition_cache = 4,096

Abnormal strings:

have_symlink = DISABLED
innodb_fast_shutdown = 1
tx_isolation = READ-COMMITTED

Tentative conclusions

  • Nothing jumps out to explain why it slows down over time.
  • There could be some good clues in the slowlog -- such as table scans that don't show up until some time after startup.