Mysql – Configure MySQL to minimize file locks

MySQLmysql-5.7performanceperformance-tuning

*** EDIT ***
8GB RAM

SHOW VARIABLES: https://pastebin.com/ANAiq1E2

SHOW GLOBAL STATUS: https://pastebin.com/pwWKRiie

*** END EDIT ***

To start off, I'm not a DBA, so I may have overlooked something simple.

Backstory, we currently host all our client accounts off of a VPS. VPS has 42 cores available, 8GB of ram, and SSD storage. Most of our databases are from CMS (Joomla or WordPress) with a dozen Symfony applications. Current table count is just over 15K with ~4.2GB of data. The initial problem started with pages becoming unresponsive for a few seconds at a time. After digging into the logs I found multiple errors like these

2020-11-27T19:50:35.218190Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5187ms. The settings might not be optimal. (flushed=8 and evicted=0, during the time.)

2020-11-27T20:17:32.699558Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4027ms. The settings might not be optimal. (flushed=14 and evicted=0, during the time.) 

The flush times were often as high as 20 seconds. I started researching fixes for it and eventually moved all tables to InnoDB, increased innodb buffer pool, increased table cache size, increased table open cache, and increased file limits. That resulted in file lock errors.

2020-11-27T16:49:38.736243Z 378 [ERROR] InnoDB: Unable to lock ./sedashoa_feb2718/zys4l_easyblog_featured.ibd error: 37
2020-11-27 10:49:38 0x7f6009133700  InnoDB: Assertion failure in thread 140050445842176 in file fil0fil.cc line 906

I wasted a lot of time tweaking the settings back down, since the first time I contacted my hosting provider about the issues, they seemed to think it was MySQL memory usage. After tweaking everything down to where the max MySQL memory usage was under 3GB and raising table open cache back, I started getting the same error: 37 crashes. This time, when I contacted hosting support, they informed me I was hitting the file ops limit. So I lowered table open cache back down to 1000.

My question is how should I configure MySQL to make the most out of the resources available? I already know we need to either move on to a dedicated host or to a managed SQL server like Digital Ocean offers. In the meantime, what should I tweak to make the most out of the ram available (figure 4GB just for MySQL) without running into the file lock limits. This is the current config.

# Default options are read from the following files in the given order:
# /etc/my.cnf /etc/mysql/my.cnf /usr/etc/my.cnf ~/.my.cnf

#[mysql]
#port                            = 3306
#socket                          = /var/run/mysqld/mysqld.sock

[mysqld]
# Required Settings
basedir                         = /usr
bind_address                    = 127.0.0.1 # Change to 0.0.0.0 to allow remote connections
datadir                         = /var/lib/mysql
default-time-zone               = '-5:00'
max_allowed_packet              = 32M       # https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_max_allowed_packet
max_connect_errors              = 1000000
performance-schema              = 1     # turn performance schema on
pid_file                        = /var/run/mysqld/mysqld.pid
port                            = 3306
skip_external_locking
#skip_name_resolve
#socket                          = /var/run/mysqld/mysqld.sock

# Enable for b/c with databases created in older MySQL/MariaDB versions (e.g. when using null dates)
sql_mode                        = ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION,ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES

tmpdir                          = /tmp
user                            = mysql

# InnoDB Settings
default_storage_engine          = InnoDB
innodb_buffer_pool_instances    = 2     # Use 1 instance per 1GB of InnoDB pool size
innodb_buffer_pool_size         = 2G    # Use up to 70-80% of RAM (working off 4GB or half VM RAM)
innodb_lru_scan_depth           = 256   # Default 1024 https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_lru_scan_depth
innodb_file_per_table           = 1
innodb_flush_log_at_trx_commit  = 1         # https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_flush_log_at_trx_commit
innodb_flush_method             = O_DIRECT  # https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_flush_method
innodb_log_buffer_size          = 16M       # https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_log_buffer_size
innodb_log_file_size            = 256M
innodb_stats_on_metadata        = 0         # https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_stats_on_metadata
innodb_use_native_aio           = 0     # MySQL will not start with native asynchronous file access enabled

#innodb_temp_data_file_path     = ibtmp1:64M:autoextend:max:20G # Control the maximum size for the ibtmp1 file
innodb_thread_concurrency       = 16     # Optional: Set to the number of CPUs on your system (minus 1 or 2) to better
                                        # contain CPU usage. E.g. if your system has 8 CPUs, try 6 or 7 and check
                                        # the overall load produced by MySQL/MariaDB.
innodb_read_io_threads          = 8
innodb_write_io_threads         = 8

# MyISAM Settings - disable cache
# query_cache_limit               = 4M    # UPD - Option supported by MariaDB & up to MySQL 5.7, remove this line on MySQL 8.x
query_cache_size                = 0     # UPD - Option supported by MariaDB & up to MySQL 5.7, remove this line on MySQL 8.x
query_cache_type                = 0     # Option supported by MariaDB & up to MySQL 5.7, remove this line on MySQL 8.x

key_buffer_size                 = 10M   # UPD

low_priority_updates            = 1         # https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_low_priority_updates
concurrent_insert               = 2         # Same as auto https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_concurrent_insert

# Connection Settings
max_connections                 = 25   # UPD
max_user_connections            = 24

back_log                        = 60
thread_cache_size               = 100       # https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_thread_cache_size
thread_stack                    = 256K      # Default value for 64 bit platforms    See: https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_thread_stack

interactive_timeout             = 180
wait_timeout                    = 60

# For MySQL 5.7+ only (disabled by default)
#max_execution_time             = 30000 # Set a timeout limit for SELECT statements (value in milliseconds).
                                        # This option may be useful to address aggressive crawling on large sites,
                                        # but it can also cause issues (e.g. with backups). So use with extreme caution and test!
                                        # More info at: https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_max_execution_time

# For MariaDB 10.1.1+ only (disabled by default)
#max_statement_time             = 30    # The equivalent of "max_execution_time" in MySQL 5.7+ (set above)
                                        # The variable is of type double, thus you can use subsecond timeout.
                                        # For example you can use value 0.01 for 10 milliseconds timeout.
                                        # More info at: https://mariadb.com/kb/en/aborting-statements/

# Buffer Settings
join_buffer_size                = 512K  # UPD
read_buffer_size                = 1M    # UPD
read_rnd_buffer_size            = 512K  # UPD
sort_buffer_size                = 2M    # UPD

# Table Settings
# In systemd managed systems like Ubuntu 16.04+ or CentOS 7+, you need to perform an extra action for table_open_cache & open_files_limit
# to be overriden (also see comment next to open_files_limit).
# E.g. for MySQL 5.7, please check: https://dev.mysql.com/doc/refman/5.7/en/using-systemd.html
# and for MariaDB check: https://mariadb.com/kb/en/library/systemd/

# ***Non crashing, but with slow cache flush ***
table_definition_cache          = 20000 # UPD   https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_table_definition_cache
table_open_cache                = 1000  # UPD Total tables on 2020-11-24=15,617  https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_table_open_cache
open_files_limit                = 2000  # UPD - This can be 2x to 3x the table_open_cache value or match the system's
                                        # open files limit usually set in /etc/sysctl.conf or /etc/security/limits.conf
                                        # In systemd managed systems this limit must also be set in:
                                        # /etc/systemd/system/mysqld.service.d/override.conf (for MySQL 5.7+) and
                                        # /etc/systemd/system/mariadb.service.d/override.conf (for MariaDB)
# ***Crashes because of VPS op limit ***
#table_definition_cache          = 20000
#table_open_cache                = 20000
#open_files_limit                = 40000

# max_heap_table_size and tmp_table_size should be changed together and kept equal
max_heap_table_size             = 16M
tmp_table_size                  = 16M

# Search Settings
ft_min_word_len                 = 3     # Minimum length of words to be indexed for search results

# Logging
log_error                       = /var/lib/mysql/mysql_error.log
log_queries_not_using_indexes   = 1
long_query_time                 = 5
slow_query_log                  = 0     # Disabled for production
slow_query_log_file             = /var/lib/mysql/mysql_slow.log

[mysqldump]
# Variable reference
# For MySQL 5.7: https://dev.mysql.com/doc/refman/5.7/en/mysqldump.html
# For MariaDB:   https://mariadb.com/kb/en/library/mysqldump/
quick
quote_names
max_allowed_packet              = 64M

Any ideas or suggestions would be appreciated.

Best Answer

Analysis of GLOBAL STATUS and VARIABLES:

Observations:

  • Version: 5.7.32
  • 8 GB of RAM
  • Uptime = 2d 20:53:10
  • You are not running on Windows.
  • Running 64-bit version
  • You appear to be running entirely (or mostly) InnoDB.

The More Important Issues:

table_open_cache = 5000 (now 1000)

innodb_flush_neighbors = 0

innodb_io_capacity = 1000
innodb_io_capacity_max = 2000

max_connections is 25. You have hit that. Increase that to, say, 100. And investigate why so many simultaneous connections are occurring.

thread_cache_size should not be bigger than max_connections, and can be lower.

Enable the slowlog and have long_query_time=1. It should help you find the inefficient queries. See http://mysql.rjweb.org/doc.php/mysql_analysis#slow_queries_and_slowlog

Details and other observations:

( (key_buffer_size / 0.20 + innodb_buffer_pool_size / 0.70) ) = ((10M / 0.20 + 2048M / 0.70)) / 8192M = 36.3% -- Most of available ram should be made available for caching. -- http://mysql.rjweb.org/doc.php/memory

( Opened_tables ) = 5,271,412 / 247990 = 21 /sec -- Frequency of opening Tables -- increase table_open_cache (now 1000)

( Table_open_cache_overflows ) = 5,269,803 / 247990 = 21 /sec -- May need to increase table_open_cache (now 1000)

( Table_open_cache_misses ) = 5,271,405 / 247990 = 21 /sec -- May need to increase table_open_cache (now 1000)

( Table_open_cache_misses / (Table_open_cache_hits + Table_open_cache_misses) ) = 5,271,405 / (23504604 + 5271405) = 18.3% -- Effectiveness of table_open_cache. -- Increase table_open_cache (now 1000) and check table_open_cache_instances (now 16).

( Innodb_buffer_pool_pages_free / Innodb_buffer_pool_pages_total ) = 36,493 / 131056 = 27.8% -- Pct of buffer_pool currently not in use -- innodb_buffer_pool_size (now 2147483648) is bigger than necessary?

( innodb_io_capacity_max / innodb_io_capacity ) = 2,000 / 200 = 10 -- Capacity: max/plain -- Recommend 2. Max should be about equal to the IOPs your I/O subsystem can handle. (If the drive type is unknown 2000/200 may be a reasonable pair.)

( innodb_flush_neighbors ) = 1 -- A minor optimization when writing blocks to disk. -- Use 0 for SSD drives; 1 for HDD.

( innodb_io_capacity ) = 200 -- I/O ops per second capable on disk . 100 for slow drives; 200 for spinning drives; 1000-2000 for SSDs; multiply by RAID factor.

( innodb_print_all_deadlocks ) = innodb_print_all_deadlocks = OFF -- Whether to log all Deadlocks. -- If you are plagued with Deadlocks, turn this on. Caution: If you have lots of deadlocks, this may write a lot to disk.

( max_connections ) = 25 -- Maximum number of connections (threads). Impacts various allocations. -- If max_connections (now 25) is too high and various memory settings are high, you could run out of RAM.

( innodb_ft_result_cache_limit ) = 2,000,000,000 / 8192M = 23.3% -- Byte limit on FULLTEXT resultset. (Possibly not preallocated, but grows?) -- Lower the setting.

( character_set_server ) = character_set_server = latin1 -- Charset problems may be helped by setting character_set_server (now latin1) to utf8mb4. That is the future default.

( (Com_show_create_table + Com_show_fields) / Questions ) = (0 + 652913) / 23566357 = 2.8% -- Naughty framework -- spending a lot of effort rediscovering the schema. -- Complain to the 3rd party vendor.

( local_infile ) = local_infile = ON -- local_infile (now ON) = ON is a potential security issue

( Created_tmp_disk_tables ) = 1,341,546 / 247990 = 5.4 /sec -- Frequency of creating disk "temp" tables as part of complex SELECTs -- increase tmp_table_size (now 16777216) and max_heap_table_size (now 16777216). Check the rules for temp tables on when MEMORY is used instead of MyISAM. Perhaps minor schema or query changes can avoid MyISAM. Better indexes and reformulation of queries are more likely to help.

( Created_tmp_disk_tables / Questions ) = 1,341,546 / 23566357 = 5.7% -- Pct of queries that needed on-disk tmp table. -- Better indexes / No blobs / etc.

( Created_tmp_disk_tables / Created_tmp_tables ) = 1,341,546 / 1531462 = 87.6% -- Percent of temp tables that spilled to disk -- Maybe increase tmp_table_size (now 16777216) and max_heap_table_size (now 16777216); improve indexes; avoid blobs, etc.

( Select_scan ) = 4,480,575 / 247990 = 18 /sec -- full table scans -- Add indexes / optimize queries (unless they are tiny tables)

( Select_scan / Com_select ) = 4,480,575 / 18327133 = 24.4% -- % of selects doing full table scan. (May be fooled by Stored Routines.) -- Add indexes / optimize queries

( expire_logs_days ) = 0 -- How soon to automatically purge binlog (after this many days). Being replaced by binlog_expire_logs_seconds. -- Too large (or zero) = consumes disk space; too small = need to respond quickly to network/machine crash. (Not relevant if log_bin (now OFF) = OFF)

( innodb_autoinc_lock_mode ) = 1 -- Galera: desires 2 -- 2 = "interleaved"; 1 = "consecutive" is typical; 0 = "traditional". -- Galera desires 2; 2 requires BINLOG_FORMAT=ROW or MIXED

( slow_query_log ) = slow_query_log = OFF -- Whether to log slow queries. (5.1.12)

( long_query_time ) = 5 -- Cutoff (Seconds) for defining a "slow" query. -- Suggest 2

( log_slow_slave_statements ) = log_slow_slave_statements = OFF -- (5.6.11, 5.7.1) By default, replicated statements won't show up in the slowlog; this causes them to show. -- It can be helpful in the slowlog to see writes that could be interfering with Replica reads.

( log_queries_not_using_indexes ) = log_queries_not_using_indexes = ON -- Whether to include such in slowlog. -- This clutters the slowlog; turn it off so you can see the real slow queries. And decrease long_query_time (now 5) to catch most interesting queries.

( back_log ) = 60 -- (Autosized as of 5.6.6; based on max_connections) -- Raising to min(150, max_connections (now 25)) may help when doing lots of connections.

( Max_used_connections / max_connections ) = 26 / 25 = 104.0% -- Peak % of connections -- increase max_connections (now 25) and/or decrease wait_timeout (now 180)

( max_connect_errors ) = 1,000,000 = 1.0e+6 -- A small protection against hackers. -- Perhaps no more than 200.

( thread_cache_size / Max_used_connections ) = 100 / 26 = 384.6% -- There is no advantage in having the thread cache bigger than your likely number of connections. Wasting space is the disadvantage.

Abnormally small:

Open_files = 0
interactive_timeout = 180
wait_timeout = 180

Abnormally large:

Com_check = 1.4 /HR
Com_delete_multi = 16 /HR
Com_rename_table = 0.32 /HR
Com_show_databases = 6.5 /HR
Com_show_fields = 2.6 /sec
Com_show_open_tables = 0.029 /HR
Com_show_profile = 9 /HR
Com_show_profiles = 0.3 /HR
Com_show_status = 1.9 /sec
Com_stmt_send_long_data = 24 /HR
Open_table_definitions = 15,451
Performance_schema_file_instances_lost = 514
back_log / max_connections = 240.0%
max_user_connections = 2.15e+9
table_definition_cache = 20,000
table_open_cache / max_connections = 40

Abnormal strings:

ft_boolean_syntax = + -><()~*:
ft_min_word_len = 3
innodb_fast_shutdown = 1
innodb_use_native_aio = OFF
low_priority_updates = ON
optimizer_trace = enabled=off,one_line=off
optimizer_trace_features = greedy_search=on, range_optimizer=on, dynamic_range=on, repeated_subselect=on
slave_rows_search_algorithms = TABLE_SCAN,INDEX_SCAN
time_zone = -05:00