MySQL join has drastically worse performance after upgrading from 5.6 to 5.7

amazon-rds, aws, MySQL, mysql-5.6, mysql-5.7

I have a fairly straightforward query that has become very slow after upgrading from MySQL 5.6.35 to MySQL 5.7.12 (running on AWS RDS).

SELECT DISTINCT 
  Name,d.id,deviceType,issuedBy, description,avNum,CompanyName,
  BrandName,dwNumber,quant,discDate,Type
FROM table_one d     
JOIN table_two i ON d.id = i.id;

In 5.6, this query completes in 90 seconds. After upgrading to 5.7, it’s taking over 30 minutes. Each table has approximately 2 million rows.

I started by comparing the optimizer settings between 5.6 and 5.7:

# 5.6
index_merge=on
index_merge_union=on
index_merge_sort_union=on
index_merge_intersection=on
engine_condition_pushdown=on
index_condition_pushdown=on
mrr=on
mrr_cost_based=on
block_nested_loop=on
batched_key_access=off
materialization=on
semijoin=on
loosescan=on
firstmatch=on
subquery_materialization_cost_based=on
use_index_extensions=on
# 5.7
index_merge=on
index_merge_union=on
index_merge_sort_union=on
index_merge_intersection=on
engine_condition_pushdown=on
index_condition_pushdown=on
mrr=on
mrr_cost_based=on
block_nested_loop=on
batched_key_access=off
materialization=on
semijoin=on
loosescan=on
firstmatch=on
subquery_materialization_cost_based=on
use_index_extensions=on
condition_fanout_filter=on
derived_merge=on
duplicateweedout=on

The only differences I see are the last three options in 5.7. I switched them off as follows:

SET optimizer_switch='condition_fanout_filter=off';
SET optimizer_switch='derived_merge=off';
SET optimizer_switch='duplicateweedout=off';
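
(For what it's worth, all three flags can also be disabled in a single statement and the result verified afterwards; a sketch, session scope shown:)

SET SESSION optimizer_switch = 'condition_fanout_filter=off,derived_merge=off,duplicateweedout=off';
SELECT @@optimizer_switch;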

This had no effect. Next, I looked at the differences in my InnoDB settings (my 5.7 instance has more memory, hence the buffer pool size difference).

# my innodb changes 5.6 -> 5.7
innodb_adaptive_hash_index_parts added in 5.7, set to '8'
innodb_additional_mem_pool_size set to 8388608 in 5.6, removed in 5.7
innodb_buffer_pool_dump_at_shutdown changed from 'OFF' to 'ON'
innodb_buffer_pool_dump_pct added in 5.7, set to '25'
innodb_buffer_pool_load_at_startup changed from 'OFF' to 'ON'
innodb_buffer_pool_size changed from 2,804,940,800 to 11,811,160,064
innodb_checksum_algorithm changed from 'innodb' to 'crc32'
innodb_deadlock_detect added in 5.7, set to 'ON'
innodb_default_row_format added in 5.7, set to 'dynamic'
innodb_file_format changed from 'Antelope' to 'Barracuda'
innodb_file_format_max changed from 'Antelope' to 'Barracuda'
innodb_fill_factor added in 5.7, set to '100'
innodb_flush_sync added in 5.7, set to 'ON'
innodb_log_checksums added in 5.7, set to 'ON'
innodb_log_write_ahead_size added in 5.7, set to 8192
innodb_max_undo_log_size added in 5.7, set to 1,073,741,824
innodb_mirrored_log_groups removed in 5.7, set to '1' in 5.6
innodb_numa_interleave added in 5.7, set to 'OFF'
innodb_page_cleaners added in 5.7, set to '4'
innodb_purge_rseg_truncate_frequency added in 5.7, set to '128'
innodb_strict_mode changed from 'OFF' to 'ON'
innodb_temp_data_file_path added in 5.7, set to 'ibtmp1:12M:autoextend'
innodb_undo_log_truncate added in 5.7, set to 'OFF'

I reviewed the reference manual and attempted to switch off the following:

innodb_strict_mode = 'OFF'
innodb_deadlock_detect = 'OFF'
innodb_flush_sync = 'OFF'
innodb_log_checksums = 'OFF'

Also, no effect.
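
(For reference, all four of these are dynamic in current 5.7 releases, so they can be flipped at runtime for testing; a sketch, assuming an account with enough privileges. On RDS, anything that cannot be changed with SET GLOBAL has to go through the DB parameter group instead.)

SET GLOBAL innodb_strict_mode     = OFF;
SET GLOBAL innodb_deadlock_detect = OFF;
SET GLOBAL innodb_flush_sync      = OFF;
SET GLOBAL innodb_log_checksums   = OFF;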

Can anyone tell me what is going on here? The query is part of a larger workflow, which has become unusable as a result. I'd like to get the 5.6 behavior back if possible, but with my limited DBA knowledge I'm running out of things to try, so I'd appreciate a pointer in the right direction or additional avenues to investigate.

Below are my tables. Many of the columns are wide, and they are mostly defined as VARCHAR(X) DEFAULT NULL because the data source isn't clean (I don't have control over the source). Note that I ultimately need utf8 data, but I am using latin1 while I troubleshoot the slow query, to eliminate the character set as a possible cause of the slowness. (The reason we need to migrate from 5.6 to 5.7 is the larger index key prefixes in 5.7, which can accommodate our larger utf8 column sizes.)

(Also, I should point out that the id field is coming from the source data, and is a long string of alphanumeric characters of variable width (up to ~30 characters), so I need it to be a VARCHAR).

CREATE TABLE `table_one` (
  `id` varchar(200) NOT NULL DEFAULT '',
  `RecordKey` varchar(50) DEFAULT NULL,
  `VersionStat` varchar(200) DEFAULT NULL,
  `Status` varchar(200) DEFAULT NULL,
  `VersionNumber` varchar(50) DEFAULT NULL,
  `VersionDate` varchar(20) DEFAULT NULL,
  `PublishDate` varchar(20) DEFAULT NULL,
  `DistStart` varchar(20) DEFAULT NULL,
  `DistCommStat` varchar(2000) DEFAULT NULL,
  `BrandName` varchar(100) DEFAULT NULL,
  `VersionModelNumber` varchar(100) DEFAULT NULL,
  `Catalog` varchar(100) DEFAULT NULL,
  `dwNumber` varchar(100) DEFAULT NULL,
  `CompanyName` varchar(500) DEFAULT NULL,
  `DeviceCount` varchar(200) DEFAULT NULL,
  `description` varchar(3000) DEFAULT NULL,
  `Exemption` varchar(1100) DEFAULT NULL,
  `PreMarket` varchar(1500) DEFAULT NULL,
  `DevDRMT` varchar(1000) DEFAULT NULL,
  `DTKit` varchar(200) DEFAULT NULL,
  `Combination` varchar(250) DEFAULT NULL,
  `Usage` varchar(500) DEFAULT NULL,
  `SingleBatch` varchar(50) DEFAULT NULL,
  `SerialNumber` varchar(250) DEFAULT NULL,
  `ManuDate` varchar(20) DEFAULT NULL,
  `ExpDate` varchar(20) DEFAULT NULL,
  `Donation` varchar(50) DEFAULT NULL,
  `LabeldWithMLO` varchar(50) DEFAULT NULL,
  `NLabledMLO` varchar(50) DEFAULT NULL,
  `MLOStatus` varchar(1000) DEFAULT NULL,
  `BTT` varchar(50) DEFAULT NULL,
  `OPP` varchar(50) DEFAULT NULL,
  `BRC` varchar(50) DEFAULT NULL,
  `PriorUse` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `table_two` (
  `id` varchar(200) DEFAULT NULL,
  `Name` varchar(200) DEFAULT NULL,
  `deviceType` varchar(50) DEFAULT NULL,
  `issuedBy` varchar(200) DEFAULT NULL,
  `avNum` varchar(50) DEFAULT NULL,
  `quant` varchar(50) DEFAULT NULL,
  `discDate` varchar(50) DEFAULT NULL,
  `PkgStatus` varchar(50) DEFAULT NULL,
  `Type` varchar(50) DEFAULT NULL,
  KEY `idx_table_two_id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

The EXPLAIN output is identical between the two versions as well:

EXPLAIN EXTENDED SELECT DISTINCT ...

******************** 1. row *********************
           id: 1
  select_type: SIMPLE
        table: d
         type: ALL
possible_keys: PRIMARY
          key: 
      key_len: 
          ref: 
         rows: 1596593
     filtered: 100.00
        Extra: Using temporary
******************** 2. row *********************
           id: 1
  select_type: SIMPLE
        table: i
         type: ref
possible_keys: idx_table_two_id
          key: idx_table_two_id
      key_len: 203
          ref: mydb.d.id
         rows: 1
     filtered: 100.00
        Extra: 
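
(Side note: in 5.7 the same plan can also be inspected with cost estimates via EXPLAIN FORMAT=JSON, which sometimes reveals differences that the tabular output hides; a sketch using the query above:)

EXPLAIN FORMAT=JSON
SELECT DISTINCT
  Name, d.id, deviceType, issuedBy, description, avNum, CompanyName,
  BrandName, dwNumber, quant, discDate, Type
FROM table_one d
JOIN table_two i ON d.id = i.id;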

UPDATE #1:

Here is some more information. I enabled the performance schema on both servers. The 5.7 server shows many hash_table_locks:

SELECT EVENT_NAME, COUNT_STAR
       FROM performance_schema.events_waits_summary_global_by_event_name
       ORDER BY COUNT_STAR DESC LIMIT 10;


# EVENT_NAME, COUNT_STAR
'wait/synch/sxlock/innodb/hash_table_locks',   '3256808433'
'wait/synch/mutex/innodb/buf_pool_mutex',      '19266748'
'wait/synch/mutex/innodb/log_sys_mutex',       '14488781'
'wait/io/table/sql/handler',                   '13676918'
'wait/synch/mutex/innodb/lock_mutex',          '11431841'
...

Additionally, I see a lot of time being spent in wait/io/table/sql/handler, presumably because of the hash_table_locks. They are the top two consumers of time (in picoseconds):

SELECT EVENT_NAME, SUM_TIMER_WAIT
       FROM performance_schema.events_waits_summary_global_by_event_name where event_name != "idle"
       ORDER BY SUM_TIMER_WAIT DESC LIMIT 10;

# EVENT_NAME, SUM_TIMER_WAIT
'wait/io/table/sql/handler',                 '1300909619487480' # 1300s
'wait/synch/sxlock/innodb/hash_table_locks', '98099101074540'   # 98s
'wait/io/file/innodb/innodb_data_file',      '5035505718525'
'wait/io/socket/sql/client_connection',      '344937541275'
'wait/synch/mutex/innodb/fil_system_mutex',  '198749837865'
...

When I repeat this on the 5.6 server, I don't see the hash_table_locks consuming time like this.
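
(To make these counters comparable between runs, the wait summary can be reset right before executing the query; a sketch:)

TRUNCATE TABLE performance_schema.events_waits_summary_global_by_event_name;
-- run the slow query, then re-run the two summary queries above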

UPDATE #2:

To check if the disk is the bottleneck, I did the following.

I launched two identical EC2 instances, both i3.large (2 vCPU / 15.25 GB RAM) with a single 425 GB SSD. On the first instance I installed MySQL 5.7.25; on the second, MariaDB 10.2.21. I kept the out-of-the-box configuration for both. The query still takes 30 minutes on MySQL, but only 30 seconds on MariaDB! The bundled InnoDB versions are nearly identical: MySQL 5.7.25 versus InnoDB 5.7.24 in MariaDB. It seems to me this must be a configuration issue within MySQL, rather than a limitation of InnoDB or the disk.

Another update: on the MariaDB instance, the EXPLAIN is slightly different. There is a "Using where" in the Extra column, and the join order is reversed. I attempted a STRAIGHT_JOIN to change the table order, but that didn't change anything:

# MariaDB
EXPLAIN SELECT SQL_NO_CACHE DISTINCT
  Name,d.id,deviceType,issuedBy, description,avNum,CompanyName,
  BrandName,dwNumber,quant,discDate,Type
FROM table_one d    
JOIN table_two i ON d.id = i.id;


******************** 1. row *********************
           id: 1
  select_type: SIMPLE
        table: i
         type: ALL
possible_keys: idx_table_two_id
          key:
      key_len:
          ref:
         rows: 2496908
        Extra: Using where; Using temporary
******************** 2. row *********************
           id: 1
  select_type: SIMPLE
        table: d
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 202
          ref: mydb.i.id
         rows: 1
        Extra:
2 rows in set

I've also applied Wilson's suggested changes below, which have not resolved the issue.

UPDATE #3:

I've found something that looks significant. I enabled the optimizer trace on my 5.6 and 5.7 servers. I've attached the output from each here:

5.6: https://pastebin.com/KSTeTDdy

5.7: https://pastebin.com/3SaXNdCU
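
(For anyone wanting to reproduce this, the trace was captured roughly along these lines; a sketch:)

SET optimizer_trace = 'enabled=on';
SET optimizer_trace_max_mem_size = 1048576;  -- raise this if the trace comes back truncated
SELECT DISTINCT
  Name, d.id, deviceType, issuedBy, description, avNum, CompanyName,
  BrandName, dwNumber, quant, discDate, Type
FROM table_one d
JOIN table_two i ON d.id = i.id;
SELECT TRACE FROM information_schema.OPTIMIZER_TRACE;
SET optimizer_trace = 'enabled=off';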

At the very end, I noticed the temporary tables are created differently:

# 5.6
{
  "converting_tmp_table_to_myisam": {
    "cause": "memory_table_size_exceeded",
    "tmp_table_info": {
      "table": "intermediate_tmp_table",
      "row_length": 4570,
      "key_length": 4585,
      "unique_constraint": true,
      "location": "disk (MyISAM)",
      "record_format": "packed"
    }
  }
}
# 5.7
{
  "converting_tmp_table_to_ondisk": {
    "cause": "memory_table_size_exceeded",
    "tmp_table_info": {
      "table": "intermediate_tmp_table",
      "row_length": 4575,
      "key_length": 8,
      "unique_constraint": true,
      "location": "disk (InnoDB)",
      "record_format": "packed"
    }
  }
}

5.6 uses MyISAM for the tmp table, while 5.7 uses InnoDB. Maybe this is why I'm seeing so many occurrences of wait/synch/sxlock/innodb/hash_table_locks in 5.7? Could this have to do with InnoDB row-level locking? The question is: is it possible to restore the old behavior to test?

I found internal_tmp_disk_storage_engine and changed it to MyISAM on the 5.7 server. That did not resolve the issue either. I can see the temp table being created under /tmp:

# ls -ltrh /tmp
-rw-rw---- 1 mysql mysql  56K Apr 26 02:27 #sql_f55_0.MAI
-rw-rw---- 1 mysql mysql 368M Apr 26 02:27 #sql_f55_0.MAD

It's definitely using MyISAM now, but the query is still slow. I notice that on all three servers (5.7, 5.6, and MariaDB) this file grows to 368M and then stops. However, the file grows MUCH faster on 5.6/MariaDB (~10-15 MB/s) than on 5.7 (<1 MB/s).
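
(For completeness, the engine switch was along these lines; internal_tmp_disk_storage_engine is a dynamic global variable in 5.7, and the default is InnoDB. A sketch:)

SET GLOBAL internal_tmp_disk_storage_engine = 'MYISAM';
SHOW GLOBAL VARIABLES LIKE 'internal_tmp_disk_storage_engine';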

Best Answer

Analysis of VARIABLES and GLOBAL STATUS:
(Alas, nothing here jumps out as an explanation of the question at hand.)

I dislike doing FORCE INDEX, etc, but STRAIGHT_JOIN may be the simplest and safest way to force the ordering of the two tables.
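
A sketch of what that would look like here, assuming the goal is to reproduce the MariaDB join order (table_two driving, table_one looked up by its PRIMARY KEY):

SELECT DISTINCT
  Name, d.id, deviceType, issuedBy, description, avNum, CompanyName,
  BrandName, dwNumber, quant, discDate, Type
FROM table_two i
STRAIGHT_JOIN table_one d ON d.id = i.id;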

Observations:

  • Version: 5.7.25-log
  • 15.25 GB of RAM
  • Uptime = 14:05:48; some GLOBAL STATUS values may not be meaningful yet.
  • You are not running on Windows.
  • Running 64-bit version
  • You appear to be running entirely (or mostly) InnoDB.

The More Important Issues:

With SSD, change these:

innodb_flush_neighbors = 0  -- from 1
innodb_io_capacity = 800    -- from 200
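
Both are dynamic, so they can be tried at runtime before persisting them in my.cnf or the parameter group; a sketch:

SET GLOBAL innodb_flush_neighbors = 0;
SET GLOBAL innodb_io_capacity = 800;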

It sounds like your dataset size is considerably less than innodb_buffer_pool_size. So, perhaps you are paying for more RAM than you need? If you are preparing for growth, then ignore this comment.

With rare exceptions, I don't think innodb_deadlock_detect should be OFF.

A lot of poorly performing queries. Crank down long_query_time and make use of the slowlog: http://mysql.rjweb.org/doc.php/mysql_analysis#slow_queries_and_slowlog
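
A sketch of turning that on at runtime (on RDS the same settings can go in the parameter group; with log_output = TABLE, as shown below, the entries land in mysql.slow_log):

SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 2;  -- the suggested cutoff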

Since only 11 connections seem to be needed, lower max_connections to, say, 40. This will avoid certain pressures on RAM.

Are you sure you want READ-UNCOMMITTED?

Details and other observations:

( Table_open_cache_misses / (Table_open_cache_hits + Table_open_cache_misses) ) = 272 / (6879 + 272) = 3.8% -- Effectiveness of table_open_cache. -- Increase table_open_cache and check table_open_cache_instances.

( innodb_lru_scan_depth * innodb_page_cleaners ) = 1,024 * 4 = 4,096 -- Amount of work for page cleaners every second. -- "InnoDB: page_cleaner: 1000ms intended loop took ..." may be fixable by lowering lru_scan_depth: Consider 1000 / innodb_page_cleaners

( innodb_page_cleaners / innodb_buffer_pool_instances ) = 4 / 8 = 0.5 -- innodb_page_cleaners -- Recommend setting innodb_page_cleaners to innodb_buffer_pool_instances

( innodb_lru_scan_depth ) = 1,024 -- "InnoDB: page_cleaner: 1000ms intended loop took ..." may be fixed by lowering lru_scan_depth

( Innodb_buffer_pool_pages_free / Innodb_buffer_pool_pages_total ) = 494,639 / 720896 = 68.6% -- Pct of buffer_pool currently not in use -- innodb_buffer_pool_size is bigger than necessary?

( Innodb_buffer_pool_bytes_data / innodb_buffer_pool_size ) = 3,590,832,128 / 11264M = 30.4% -- Percent of buffer pool taken up by data -- A small percent may indicate that the buffer_pool is unnecessarily big.

( Uptime / 60 * innodb_log_file_size / Innodb_os_log_written ) = 50,748 / 60 * 128M / 2513408 = 45,166 -- Minutes between InnoDB log rotations. Beginning with 5.6.8, this can be changed dynamically; be sure to also change my.cnf. -- (The recommendation of 60 minutes between rotations is somewhat arbitrary.) Adjust innodb_log_file_size. (Cannot change in AWS.)

( innodb_flush_neighbors ) = 1 -- A minor optimization when writing blocks to disk. -- Use 0 for SSD drives; 1 for HDD.

( innodb_io_capacity ) = 200 -- I/O ops per second capable on disk. 100 for slow drives; 200 for spinning drives; 1000-2000 for SSDs; multiply by RAID factor.

( innodb_thread_concurrency ) = 0 -- 0 = Let InnoDB decide the best for concurrency_tickets. -- Set to 0 or 64. This may cut back on CPU.

( innodb_print_all_deadlocks ) = innodb_print_all_deadlocks = OFF -- Whether to log all Deadlocks. -- If you are plagued with Deadlocks, turn this on. Caution: If you have lots of deadlocks, this may write a lot to disk.

( innodb_deadlock_detect ) = innodb_deadlock_detect = OFF -- 5.7.15 provides way to speed up heavy inserting by turning off deadlock detection -- Do you really want to turn it OFF?

( local_infile ) = local_infile = ON -- local_infile = ON is a potential security issue

( Created_tmp_disk_tables / Questions ) = 37,591 / 73621 = 51.1% -- Pct of queries that needed on-disk tmp table. -- Better indexes / No blobs / etc.

( Created_tmp_disk_tables / Created_tmp_tables ) = 37,591 / 46560 = 80.7% -- Percent of temp tables that spilled to disk -- Maybe increase tmp_table_size and max_heap_table_size; improve indexes; avoid blobs, etc.
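
A sketch of raising both at runtime (the smaller of the two is the effective in-memory limit; the values here are illustrative, and with ~4.5 KB rows and millions of distinct rows the DISTINCT temp table may well spill to disk regardless):

SET GLOBAL tmp_table_size      = 268435456;  -- 256M
SET GLOBAL max_heap_table_size = 268435456;  -- 256M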

( (Com_insert + Com_update + Com_delete + Com_replace) / Com_commit ) = (168 + 0 + 2 + 0) / 170 = 1 -- Statements per Commit (assuming all InnoDB) -- Low: Might help to group queries together in transactions; High: long transactions strain various things.

( Select_scan ) = 51,492 / 50748 = 1 /sec -- full table scans -- Add indexes / optimize queries (unless they are tiny tables)

( Select_scan / Com_select ) = 51,492 / 71543 = 72.0% -- % of selects doing full table scan. (May be fooled by Stored Routines.) -- Add indexes / optimize queries

( ( Com_stmt_prepare - Com_stmt_close ) / ( Com_stmt_prepare + Com_stmt_close ) ) = ( 2 - 0 ) / ( 2 + 0 ) = 100.0% -- Are you closing your prepared statements?

( binlog_format ) = binlog_format = MIXED -- STATEMENT/ROW/MIXED. ROW is preferred; it may become the default.

( expire_logs_days ) = 0 -- How soon to automatically purge binlog (after this many days) -- Too large (or zero) = consumes disk space; too small = need to respond quickly to network/machine crash.

( slow_query_log ) = slow_query_log = OFF -- Whether to log slow queries. (5.1.12)

( long_query_time ) = 10 -- Cutoff (Seconds) for defining a "slow" query. -- Suggest 2

( Threads_created / Connections ) = 11 / 197 = 5.6% -- Rapidity of process creation -- Increase thread_cache_size (non-Windows)

( thread_cache_size / max_connections ) = 21 / 1320 = 1.6% -- (0 for Windows)

( thread_cache_size / Max_used_connections ) = 21 / 11 = 190.9% -- There is no advantage in having the thread cache bigger than your likely number of connections. Wasting space is the disadvantage.

You have the Query Cache half-off. You should set both query_cache_type = OFF and query_cache_size = 0. There is (according to a rumor) a 'bug' in the QC code that leaves some code on unless you turn off both of those settings.
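
A sketch of doing that (if the server was started with query_cache_type = 0, the type cannot be changed at runtime and both values belong in my.cnf / the parameter group instead):

SET GLOBAL query_cache_type = OFF;
SET GLOBAL query_cache_size = 0;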

Abnormally small:

Com_set_option = 6.7 /HR
Handler_read_rnd = 0.85 /HR
Innodb_dblwr_pages_written / Innodb_dblwr_writes = 1.05
Select_range = 0
Select_range / Com_select = 0
Sort_rows = 0.85 /HR
Table_locks_immediate = 0.03 /sec
Table_open_cache_hits = 0.14 /sec

Abnormally large:

(Com_select + Qcache_hits) / (Com_insert + Com_update + Com_delete + Com_replace) = 420
Com_alter_user = 0.071 /HR
Com_flush = 12 /HR
Com_purge = 12 /HR
Com_release_savepoint = 0.071 /HR
Com_savepoint = 0.071 /HR
Handler_read_next / Handler_read_key = 145
Handler_read_rnd_next / Handler_read_rnd = 2.43e+6
Handler_savepoint = 0.071 /HR
Handler_savepoint_rollback = 0.28 /HR
Innodb_buffer_pool_pages_flushed / max(Questions, Queries) = 1.24
Performance_schema_file_instances_lost = 2

Abnormal strings:

ft_boolean_syntax = + -><()~*:&
gtid_mode = OFF_PERMISSIVE
have_ssl = YES
innodb_fast_shutdown = 1
innodb_flush_sync = OFF
innodb_log_checksums = OFF
log_output = TABLE
log_statements_unsafe_for_binlog = OFF
optimizer_trace = enabled=off,one_line=off
optimizer_trace_features = greedy_search=on, range_optimizer=on, dynamic_range=on, repeated_subselect=on
relay_log_recovery = ON
session_track_system_variables = time_zone, autocommit, character_set_client, character_set_results, character_set_connection
time_zone = UTC
transaction_isolation = READ-UNCOMMITTED
tx_isolation = READ-UNCOMMITTED