I have a machine with a large MySQL 5.6 database (multiple tables with 10-100 million rows). The database carries a considerable load, especially in the evening, and runs on a 16-core machine.
No matter what the load is though, we always get these intermittent spikes that can cause problems. Our CPU load looks like this:
(these are very load-light times, especially the time before 6am, should be basically "idle")
The only solution to the problem I have found so far is setting up a new server, mirroring the data and switching to that one. This usually buys me about 2-3 months, then the spikes start appearing again. Just restarting MySQL or rebooting the server does not change anything.
These are also not caused by cronjobs. Even if I disable all of them, this still happens.
Here is a gist of the InnoDB status right now:
https://gist.github.com/fleshgolem/de1d4a661fb545fabfda
And here is a dump of the server variables:
Variable_name Value
auto_increment_increment 1
auto_increment_offset 1
autocommit ON
automatic_sp_privileges ON
back_log 200
basedir /usr
big_tables OFF
bind_address *
binlog_cache_size 32768
binlog_checksum CRC32
binlog_direct_non_transactional_updates OFF
binlog_format ROW
binlog_max_flush_queue_time 0
binlog_order_commits ON
binlog_row_image FULL
binlog_rows_query_log_events OFF
binlog_stmt_cache_size 32768
block_encryption_mode aes-128-ecb
bulk_insert_buffer_size 8388608
character_set_client utf8
character_set_connection utf8
character_set_database latin1
character_set_filesystem binary
character_set_results utf8
character_set_server latin1
character_set_system utf8
character_sets_dir /usr/share/mysql/charsets/
collation_connection utf8_general_ci
collation_database latin1_swedish_ci
collation_server latin1_swedish_ci
completion_type NO_CHAIN
concurrent_insert AUTO
connect_timeout 10
core_file OFF
datadir /var/lib/mysql/
date_format %Y-%m-%d
datetime_format %Y-%m-%d %H:%i:%s
default_storage_engine InnoDB
default_tmp_storage_engine InnoDB
default_week_format 0
delay_key_write ON
delayed_insert_limit 100
delayed_insert_timeout 300
delayed_queue_size 1000
disconnect_on_expired_password ON
div_precision_increment 4
end_markers_in_json OFF
enforce_gtid_consistency ON
eq_range_index_dive_limit 10
error_count 0
event_scheduler OFF
expire_logs_days 0
explicit_defaults_for_timestamp OFF
external_user
flush OFF
flush_time 0
foreign_key_checks ON
ft_boolean_syntax + -><()~*:""&|
ft_max_word_len 84
ft_min_word_len 4
ft_query_expansion_limit 20
ft_stopword_file (built-in)
general_log OFF
general_log_file /var/lib/mysql/xxxx.log
group_concat_max_len 1024
gtid_executed
gtid_mode ON
gtid_next AUTOMATIC
gtid_owned
gtid_purged
have_compress YES
have_crypt YES
have_dynamic_loading YES
have_geometry YES
have_openssl DISABLED
have_profiling YES
have_query_cache YES
have_rtree_keys YES
have_ssl DISABLED
have_symlink DISABLED
host_cache_size 640
hostname xxxx
identity 0
ignore_builtin_innodb OFF
ignore_db_dirs
init_connect
init_file
init_slave
innodb_adaptive_flushing ON
innodb_adaptive_flushing_lwm 10
innodb_adaptive_hash_index ON
innodb_adaptive_max_sleep_delay 150000
innodb_additional_mem_pool_size 8388608
innodb_api_bk_commit_interval 5
innodb_api_disable_rowlock OFF
innodb_api_enable_binlog OFF
innodb_api_enable_mdl OFF
innodb_api_trx_level 0
innodb_autoextend_increment 64
innodb_autoinc_lock_mode 1
innodb_buffer_pool_dump_at_shutdown OFF
innodb_buffer_pool_dump_now OFF
innodb_buffer_pool_filename ib_buffer_pool
innodb_buffer_pool_instances 8
innodb_buffer_pool_load_abort OFF
innodb_buffer_pool_load_at_startup OFF
innodb_buffer_pool_load_now OFF
innodb_buffer_pool_size 42949672960
innodb_change_buffer_max_size 25
innodb_change_buffering all
innodb_checksum_algorithm innodb
innodb_checksums ON
innodb_cmp_per_index_enabled OFF
innodb_commit_concurrency 0
innodb_compression_failure_threshold_pct 5
innodb_compression_level 6
innodb_compression_pad_pct_max 50
innodb_concurrency_tickets 5000
innodb_data_file_path ibdata1:12M:autoextend
innodb_data_home_dir
innodb_disable_sort_file_cache OFF
innodb_doublewrite OFF
innodb_fast_shutdown 1
innodb_file_format Antelope
innodb_file_format_check ON
innodb_file_format_max Antelope
innodb_file_per_table ON
innodb_flush_log_at_timeout 1
innodb_flush_log_at_trx_commit 2
innodb_flush_method O_DIRECT
innodb_flush_neighbors 0
innodb_flushing_avg_loops 30
innodb_force_load_corrupted OFF
innodb_force_recovery 0
innodb_ft_aux_table
innodb_ft_cache_size 8000000
innodb_ft_enable_diag_print OFF
innodb_ft_enable_stopword ON
innodb_ft_max_token_size 84
innodb_ft_min_token_size 3
innodb_ft_num_word_optimize 2000
innodb_ft_result_cache_limit 2000000000
innodb_ft_server_stopword_table
innodb_ft_sort_pll_degree 2
innodb_ft_total_cache_size 640000000
innodb_ft_user_stopword_table
innodb_io_capacity 200
innodb_io_capacity_max 2000
innodb_large_prefix OFF
innodb_lock_wait_timeout 50
innodb_locks_unsafe_for_binlog OFF
innodb_log_buffer_size 8388608
innodb_log_compressed_pages ON
innodb_log_file_size 104857600
innodb_log_files_in_group 2
innodb_log_group_home_dir ./
innodb_lru_scan_depth 1024
innodb_max_dirty_pages_pct 75
innodb_max_dirty_pages_pct_lwm 0
innodb_max_purge_lag 0
innodb_max_purge_lag_delay 0
innodb_mirrored_log_groups 1
innodb_monitor_disable
innodb_monitor_enable
innodb_monitor_reset
innodb_monitor_reset_all
innodb_old_blocks_pct 37
innodb_old_blocks_time 1000
innodb_online_alter_log_max_size 134217728
innodb_open_files 2000
innodb_optimize_fulltext_only OFF
innodb_page_size 16384
innodb_print_all_deadlocks OFF
innodb_purge_batch_size 300
innodb_purge_threads 1
innodb_random_read_ahead OFF
innodb_read_ahead_threshold 56
innodb_read_io_threads 16
innodb_read_only OFF
innodb_replication_delay 0
innodb_rollback_on_timeout OFF
innodb_rollback_segments 128
innodb_sort_buffer_size 1048576
innodb_spin_wait_delay 6
innodb_stats_auto_recalc ON
innodb_stats_method nulls_equal
innodb_stats_on_metadata OFF
innodb_stats_persistent ON
innodb_stats_persistent_sample_pages 20
innodb_stats_sample_pages 8
innodb_stats_transient_sample_pages 8
innodb_status_output OFF
innodb_status_output_locks OFF
innodb_strict_mode OFF
innodb_support_xa ON
innodb_sync_array_size 1
innodb_sync_spin_loops 30
innodb_table_locks ON
innodb_thread_concurrency 0
innodb_thread_sleep_delay 10000
innodb_undo_directory .
innodb_undo_logs 128
innodb_undo_tablespaces 0
innodb_use_native_aio ON
innodb_use_sys_malloc ON
innodb_version 5.6.19
innodb_write_io_threads 16
insert_id 0
interactive_timeout 28800
join_buffer_size 262144
keep_files_on_create OFF
key_buffer_size 8388608
key_cache_age_threshold 300
key_cache_block_size 1024
key_cache_division_limit 100
large_files_support ON
large_page_size 0
large_pages OFF
last_insert_id 0
lc_messages en_US
lc_messages_dir /usr/share/mysql/
lc_time_names en_US
license GPL
local_infile ON
lock_wait_timeout 31536000
locked_in_memory OFF
log_bin ON
log_bin_basename /var/lib/mysql/xxxx-db1-bin
log_bin_index /var/lib/mysql/xxxx-db1-bin.index
log_bin_trust_function_creators OFF
log_bin_use_v1_row_events OFF
log_error /var/log/mysqld.log
log_output FILE
log_queries_not_using_indexes OFF
log_slave_updates ON
log_slow_admin_statements OFF
log_slow_slave_statements OFF
log_throttle_queries_not_using_indexes 0
log_warnings 1
long_query_time 10.000000
low_priority_updates OFF
lower_case_file_system OFF
lower_case_table_names 0
master_info_repository TABLE
master_verify_checksum OFF
max_allowed_packet 4194304
max_binlog_cache_size 18446744073709547520
max_binlog_size 1073741824
max_binlog_stmt_cache_size 18446744073709547520
max_connect_errors 100
max_connections 750
max_delayed_threads 20
max_error_count 64
max_heap_table_size 16777216
max_insert_delayed_threads 20
max_join_size 18446744073709551615
max_length_for_sort_data 1024
max_prepared_stmt_count 16382
max_relay_log_size 0
max_seeks_for_key 18446744073709551615
max_sort_length 1024
max_sp_recursion_depth 0
max_tmp_tables 32
max_user_connections 0
max_write_lock_count 18446744073709551615
metadata_locks_cache_size 1024
metadata_locks_hash_instances 8
min_examined_row_limit 0
multi_range_count 256
myisam_data_pointer_size 6
myisam_max_sort_file_size 9223372036853727232
myisam_mmap_size 18446744073709551615
myisam_recover_options OFF
myisam_repair_threads 1
myisam_sort_buffer_size 8388608
myisam_stats_method nulls_unequal
myisam_use_mmap OFF
net_buffer_length 16384
net_read_timeout 30
net_retry_count 10
net_write_timeout 60
new OFF
old OFF
old_alter_table OFF
old_passwords 0
open_files_limit 5000
optimizer_prune_level 1
optimizer_search_depth 62
optimizer_switch index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,subquery_materialization_cost_based=on,use_index_extensions=on
optimizer_trace enabled=off,one_line=off
optimizer_trace_features greedy_search=on,range_optimizer=on,dynamic_range=on,repeated_subselect=on
optimizer_trace_limit 1
optimizer_trace_max_mem_size 16384
optimizer_trace_offset -1
performance_schema ON
performance_schema_accounts_size 100
performance_schema_digests_size 10000
performance_schema_events_stages_history_long_size 10000
performance_schema_events_stages_history_size 10
performance_schema_events_statements_history_long_size 10000
performance_schema_events_statements_history_size 10
performance_schema_events_waits_history_long_size 10000
performance_schema_events_waits_history_size 10
performance_schema_hosts_size 100
performance_schema_max_cond_classes 80
performance_schema_max_cond_instances 5900
performance_schema_max_file_classes 50
performance_schema_max_file_handles 32768
performance_schema_max_file_instances 7693
performance_schema_max_mutex_classes 200
performance_schema_max_mutex_instances 19500
performance_schema_max_rwlock_classes 40
performance_schema_max_rwlock_instances 10300
performance_schema_max_socket_classes 10
performance_schema_max_socket_instances 1520
performance_schema_max_stage_classes 150
performance_schema_max_statement_classes 168
performance_schema_max_table_handles 4000
performance_schema_max_table_instances 12500
performance_schema_max_thread_classes 50
performance_schema_max_thread_instances 1600
performance_schema_session_connect_attrs_size 512
performance_schema_setup_actors_size 100
performance_schema_setup_objects_size 100
performance_schema_users_size 100
pid_file /var/run/mysqld/mysqld.pid
plugin_dir /usr/lib64/mysql/plugin/
port 3306
preload_buffer_size 32768
profiling OFF
profiling_history_size 15
protocol_version 10
proxy_user
pseudo_slave_mode OFF
pseudo_thread_id 2018
query_alloc_block_size 8192
query_cache_limit 1048576
query_cache_min_res_unit 4096
query_cache_size 1048576000
query_cache_type ON
query_cache_wlock_invalidate OFF
query_prealloc_size 8192
rand_seed1 0
rand_seed2 0
range_alloc_block_size 4096
read_buffer_size 131072
read_only OFF
read_rnd_buffer_size 262144
relay_log
relay_log_basename
relay_log_index
relay_log_info_file relay-log.info
relay_log_info_repository TABLE
relay_log_purge ON
relay_log_recovery OFF
relay_log_space_limit 0
report_host 10.129.156.251
report_password
report_port 3306
report_user
rpl_stop_slave_timeout 31536000
secure_auth ON
secure_file_priv
server_id 1
server_id_bits 32
server_uuid 96bcdb70-44c5-11e5-8df9-0401654ee301
skip_external_locking ON
skip_name_resolve OFF
skip_networking OFF
skip_show_database OFF
slave_allow_batching OFF
slave_checkpoint_group 512
slave_checkpoint_period 300
slave_compressed_protocol OFF
slave_exec_mode STRICT
slave_load_tmpdir /tmp
slave_max_allowed_packet 1073741824
slave_net_timeout 3600
slave_parallel_workers 0
slave_pending_jobs_size_max 16777216
slave_rows_search_algorithms TABLE_SCAN,INDEX_SCAN
slave_skip_errors ALL
slave_sql_verify_checksum ON
slave_transaction_retries 10
slave_type_conversions
slow_launch_time 2
slow_query_log OFF
slow_query_log_file /var/lib/mysql/xxxx-slow.log
socket /var/lib/mysql/mysql.sock
sort_buffer_size 262144
sql_auto_is_null OFF
sql_big_selects ON
sql_buffer_result OFF
sql_log_bin ON
sql_log_off OFF
sql_mode STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION
sql_notes ON
sql_quote_show_create ON
sql_safe_updates OFF
sql_select_limit 18446744073709551615
sql_slave_skip_counter 0
sql_warnings OFF
ssl_ca
ssl_capath
ssl_cert
ssl_cipher
ssl_crl
ssl_crlpath
ssl_key
storage_engine InnoDB
stored_program_cache 256
sync_binlog 0
sync_frm ON
sync_master_info 1
sync_relay_log 10000
sync_relay_log_info 10000
system_time_zone EST
table_definition_cache 1400
table_open_cache 2000
table_open_cache_instances 1
thread_cache_size 15
thread_concurrency 10
thread_handling one-thread-per-connection
thread_stack 262144
time_format %H:%i:%s
time_zone SYSTEM
timed_mutexes OFF
timestamp 1457343545.452900
tmp_table_size 16777216
tmpdir /tmp
transaction_alloc_block_size 8192
transaction_allow_batching OFF
transaction_prealloc_size 4096
tx_isolation REPEATABLE-READ
tx_read_only OFF
unique_checks ON
updatable_views_with_limit YES
version 5.6.19-log
version_comment MySQL Community Server (GPL)
version_compile_machine x86_64
version_compile_os Linux
wait_timeout 28800
warning_count 0
If there is any more relevant info you need, please let me know and I will update the question accordingly. I unfortunately have no clue anymore what I am looking for.
Best Answer
This is probably it: query_cache_size = 1048576000 together with query_cache_type = ON.
What is happening: a write comes in, and the entire GB of QC needs to be scanned to find all cached queries involving that table so they can be purged. That takes a lot of CPU time. Meanwhile, all SELECTs are blocked.
Do not set the size bigger than about 50M, regardless of how much RAM you have. It would probably be wise to also use DEMAND instead of ON, and hand-pick which SELECTs to have SQL_CACHE and which to have SQL_NO_CACHE.
Or it may be that the QC is not worth having on at all. This is the common case for Production systems that have constant write traffic.
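To make the mechanism concrete, here is a toy Python model of the behavior described above. It is only a sketch: the class name ToyQueryCache and its methods are made up for illustration, and MySQL's real query cache is implemented differently in detail. The point is that the work done per write grows with the number of cached entries, not with the size of the write.

```python
class ToyQueryCache:
    """Hypothetical, simplified model of a query cache (not MySQL's code)."""

    def __init__(self):
        # query text -> (set of tables the query reads, cached result)
        self.entries = {}

    def store(self, query, tables, result):
        self.entries[query] = (frozenset(tables), result)

    def invalidate(self, table):
        """Model a write to `table`: scan every entry, purge the stale ones."""
        scanned = len(self.entries)
        stale = [q for q, (tables, _) in self.entries.items() if table in tables]
        for q in stale:
            del self.entries[q]
        return scanned, len(stale)


cache = ToyQueryCache()
cache.store("SELECT * FROM orders WHERE id = 1", ["orders"], [("row1",)])
cache.store("SELECT * FROM orders WHERE id = 2", ["orders"], [("row2",)])
cache.store("SELECT * FROM users WHERE id = 7", ["users"], [("row7",)])

# A single write to `orders` walks the whole cache and drops both orders entries.
scanned, purged = cache.invalidate("orders")
print((scanned, purged))  # (3, 2)
```

In this model a bigger cache means more entries to walk on every write, which is the intuition behind keeping query_cache_size small or turning the QC off on write-heavy servers.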
More...
Based on VARIABLES and GLOBAL STATUS
Observations:
Version: 5.6.19-log
50 GB of RAM
Uptime = 11d 07:07:58
You are not running on Windows.
Running 64-bit version
It appears that you are running both MyISAM and InnoDB.
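The per-second rates in the rest of this answer divide each status counter by 976078, which is the uptime above expressed in seconds. A minimal Python sketch of that conversion follows; the helper name per_second is only illustrative.

```python
from datetime import timedelta

# Uptime = 11d 07:07:58, as reported in the observations.
uptime = timedelta(days=11, hours=7, minutes=7, seconds=58)
uptime_seconds = int(uptime.total_seconds())
print(uptime_seconds)  # 976078


def per_second(counter, seconds=uptime_seconds):
    """Average rate of a cumulative status counter over the uptime."""
    return counter / seconds


# Innodb_log_writes = 40,597,718 since startup.
print(round(per_second(40_597_718)))  # 42
```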
The More Important Issues
Either convert completely to InnoDB or tweak the cache sizes. Suggest
query_cache_size is really bad at 1000M. Your usage is moderately effective, so consider:
adding SQL_CACHE or SQL_NO_CACHE to all SELECTs, based on which ones are likely to benefit,
lowering query_cache_size to only 100M.
A lot of queries are using tmp tables and, worse, disk tmp tables. Using the slowlog, find out which queries are the most invasive; let's work on them.
Raise tmp_table_size and max_heap_table_size from 16M to 32M (but no more). Since there are two ways that tmp tables can turn into 'disk tmp tables', this might prevent some conversions.
slave_skip_errors = ALL -- Sweeping problems under the rug. Big time!
Details and other observations
( Innodb_buffer_pool_pages_free * 16384 / innodb_buffer_pool_size ) = 1,864,255 * 16384 / 42949672960 = 71.1% -- buffer pool free -- buffer_pool_size is bigger than working set; could decrease it
( Innodb_log_writes ) = 40,597,718 / 976078 = 42 /sec
( Com_rollback ) = 99,316,489 / 976078 = 101 /sec -- ROLLBACKs in InnoDB. -- An excessive frequency of rollbacks may indicate inefficient app logic.
( local_infile ) = ON -- local_infile = ON is a potential security issue
( Key_writes / Key_write_requests ) = 2,200,386 / 4113735 = 53.5% -- key_buffer effectiveness for writes -- If you have enough RAM, it would be worthwhile to increase key_buffer_size.
( query_cache_size ) = 1000M -- Size of QC -- Too small = not of much use. Too large = too much overhead. Recommend either 0 or no more than 50M.
( Qcache_not_cached ) = 235,136,200 / 976078 = 240 /sec -- SQL_CACHE attempted, but ignored -- Rethink caching; tune qcache
( Qcache_inserts - Qcache_queries_in_cache ) = (258097280 - 7736) / 976078 = 264 /sec -- Invalidations/sec.
( (query_cache_size - Qcache_free_memory) / Qcache_queries_in_cache / query_alloc_block_size ) = (1000M - 11519952) / 7736 / 8192 = 16.4 -- query_alloc_block_size vs formula -- Adjust query_alloc_block_size
( Created_tmp_tables ) = 31,198,441 / 976078 = 32 /sec -- Frequency of creating "temp" tables as part of complex SELECTs.
( Created_tmp_disk_tables ) = 5,996,371 / 976078 = 6.1 /sec -- Frequency of creating disk "temp" tables as part of complex SELECTs -- increase tmp_table_size and max_heap_table_size. Check the rules for temp tables being able to use MEMORY instead of MyISAM. It may be possible to make a minor schema or query change to avoid MyISAM. Better indexes and reformulation of queries are more likely to help.
( Handler_read_rnd_next ) = 1,066,165,445,800 / 976078 = 1092295 /sec -- High if lots of table scans -- possibly inadequate keys
( Com_rollback / Com_commit ) = 99,316,489 / 49548802 = 200.4% -- Rollback : Commit ratio -- Rollbacks are costly; change app logic
( Select_scan ) = 16,213,392 / 976078 = 17 /sec -- full table scans -- Add indexes / optimize queries (unless they are tiny tables)
( Com_insert + Com_delete + Com_delete_multi + Com_replace + Com_update + Com_update_multi ) = (57757683 + 26581027 + 0 + 0 + 34482709 + 0) / 976078 = 121 /sec -- writes/sec -- 50 writes/sec + log flushes will probably max out I/O write capacity of normal drives
( expire_logs_days ) = 0 -- How soon to automatically purge binlog (after this many days) -- Too large (or zero) = consumes disk space; too small = need to respond quickly to network/machine crash. (Not relevant if log_bin = OFF)
( slow_query_log ) = OFF -- Whether to log slow queries. (5.1.12)
( long_query_time ) = 10.000000 = 10 -- Cutoff (Seconds) for defining a "slow" query. -- Suggest 2
( Aborted_clients / Connections ) = 33,444 / 45497 = 73.5% -- Threads bumped due to timeout -- Increase wait_timeout; be nice, use disconnect
( Threads_created / Connections ) = 3,675 / 45497 = 8.1% -- Rapidity of process creation -- Increase thread_cache_size
innodb_log_file_size is small (but hard to change).
Good caching in buffer_pool.
Good caching of table_definitions.
Com_delete = 27/sec
Any swapping?
GTID -- is this Master?
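For anyone who wants to re-check the numbers, a few of the ratios above can be reproduced from the raw counters. This is a small Python sketch using only values quoted in this answer and in the variable dump; the dictionary layout and variable names are just for illustration.

```python
UPTIME_SECONDS = 976_078  # 11d 07:07:58

# Counters as quoted in the details above (from SHOW GLOBAL STATUS).
status = {
    "Innodb_buffer_pool_pages_free": 1_864_255,
    "Com_rollback": 99_316_489,
    "Com_commit": 49_548_802,
    "Qcache_inserts": 258_097_280,
    "Qcache_queries_in_cache": 7_736,
    "Created_tmp_disk_tables": 5_996_371,
}
# Settings from the variable dump in the question.
variables = {
    "innodb_buffer_pool_size": 42_949_672_960,
    "innodb_page_size": 16_384,
}

# Share of the buffer pool that is still free.
buffer_pool_free_pct = (
    100
    * status["Innodb_buffer_pool_pages_free"]
    * variables["innodb_page_size"]
    / variables["innodb_buffer_pool_size"]
)

# Rollbacks relative to commits.
rollback_to_commit_pct = 100 * status["Com_rollback"] / status["Com_commit"]

# Query cache entries inserted but no longer present, per second.
invalidations_per_sec = (
    status["Qcache_inserts"] - status["Qcache_queries_in_cache"]
) / UPTIME_SECONDS

# Temp tables that ended up on disk, per second.
tmp_disk_per_sec = status["Created_tmp_disk_tables"] / UPTIME_SECONDS

print(round(buffer_pool_free_pct, 1))    # 71.1
print(round(rollback_to_commit_pct, 1))  # 200.4
print(int(invalidations_per_sec))        # 264
print(round(tmp_disk_per_sec, 1))        # 6.1
```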