MySQL InnoDB – Optimize Slow Query with 2 Million Rows

indexinnodbMySQL

I have a table with about 2 million rows (growing at about 1 million every 3 months) and am having trouble with slow queries. I've tried changing my query and indexes (ensuring left most order) to no avail – can anyone help?

+---------------------------------+------------------------+
| Variable_name                   | Value                  |
+---------------------------------+------------------------+
| have_innodb                     | YES                    |
| ignore_builtin_innodb           | OFF                    |
| innodb_adaptive_flushing        | ON                     |
| innodb_adaptive_hash_index      | ON                     |
| innodb_additional_mem_pool_size | 8388608                |
| innodb_autoextend_increment     | 8                      |
| innodb_autoinc_lock_mode        | 1                      |
| innodb_buffer_pool_instances    | 1                      |
| innodb_buffer_pool_size         | 134217728              |
| innodb_change_buffering         | all                    |
| innodb_checksums                | ON                     |
| innodb_commit_concurrency       | 0                      |
| innodb_concurrency_tickets      | 500                    |
| innodb_data_file_path           | ibdata1:10M:autoextend |
| innodb_data_home_dir            |                        |
| innodb_doublewrite              | ON                     |
| innodb_fast_shutdown            | 1                      |
| innodb_file_format              | Antelope               |
| innodb_file_format_check        | ON                     |
| innodb_file_format_max          | Antelope               |
| innodb_file_per_table           | OFF                    |
| innodb_flush_log_at_trx_commit  | 1                      |
| innodb_flush_method             |                        |
| innodb_force_load_corrupted     | OFF                    |
| innodb_force_recovery           | 0                      |
| innodb_io_capacity              | 200                    |
| innodb_large_prefix             | OFF                    |
| innodb_lock_wait_timeout        | 50                     |
| innodb_locks_unsafe_for_binlog  | OFF                    |
| innodb_log_buffer_size          | 8388608                |
| innodb_log_file_size            | 5242880                |
| innodb_log_files_in_group       | 2                      |
| innodb_log_group_home_dir       | ./                     |
| innodb_max_dirty_pages_pct      | 75                     |
| innodb_max_purge_lag            | 0                      |
| innodb_mirrored_log_groups      | 1                      |
| innodb_old_blocks_pct           | 37                     |
| innodb_old_blocks_time          | 0                      |
| innodb_open_files               | 300                    |
| innodb_print_all_deadlocks      | OFF                    |
| innodb_purge_batch_size         | 20                     |
| innodb_purge_threads            | 0                      |
| innodb_random_read_ahead        | OFF                    |
| innodb_read_ahead_threshold     | 56                     |
| innodb_read_io_threads          | 4                      |
| innodb_replication_delay        | 0                      |
| innodb_rollback_on_timeout      | OFF                    |
| innodb_rollback_segments        | 128                    |
| innodb_spin_wait_delay          | 6                      |
| innodb_stats_method             | nulls_equal            |
| innodb_stats_on_metadata        | ON                     |
| innodb_stats_sample_pages       | 8                      |
| innodb_strict_mode              | OFF                    |
| innodb_support_xa               | ON                     |
| innodb_sync_spin_loops          | 30                     |
| innodb_table_locks              | ON                     |
| innodb_thread_concurrency       | 0                      |
| innodb_thread_sleep_delay       | 10000                  |
| innodb_use_native_aio           | OFF                    |
| innodb_use_sys_malloc           | ON                     |
| innodb_version                  | 5.5.37                 |
| innodb_write_io_threads         | 4                      |
+---------------------------------+------------------------+

mysql> show variables like '%size%';
+---------------------------------------------------+----------------------+
| Variable_name                                     | Value                |
+---------------------------------------------------+----------------------+
| binlog_cache_size                                 | 32768                |
| binlog_stmt_cache_size                            | 32768                |
| bulk_insert_buffer_size                           | 8388608              |
| delayed_queue_size                                | 1000                 |
| innodb_additional_mem_pool_size                   | 8388608              |
| innodb_buffer_pool_size                           | 134217728            |
| innodb_log_buffer_size                            | 8388608              |
| innodb_log_file_size                              | 5242880              |
| innodb_purge_batch_size                           | 20                   |
| join_buffer_size                                  | 131072               |
| key_buffer_size                                   | 16777216             |
| key_cache_block_size                              | 1024                 |
| large_page_size                                   | 0                    |
| max_binlog_cache_size                             | 18446744073709547520 |
| max_binlog_size                                   | 104857600            |
| max_binlog_stmt_cache_size                        | 18446744073709547520 |
| max_heap_table_size                               | 67108864             |
| max_join_size                                     | 18446744073709551615 |
| max_long_data_size                                | 16777216             |
| max_relay_log_size                                | 0                    |
| metadata_locks_cache_size                         | 1024                 |
| myisam_data_pointer_size                          | 6                    |
| myisam_max_sort_file_size                         | 9223372036853727232  |
| myisam_mmap_size                                  | 18446744073709551615 |
| myisam_sort_buffer_size                           | 8388608              |
| performance_schema_events_waits_history_long_size | 10000                |
| performance_schema_events_waits_history_size      | 10                   |
| preload_buffer_size                               | 32768                |
| profiling_history_size                            | 15                   |
| query_alloc_block_size                            | 8192                 |
| query_cache_size                                  | 16777216             |
| query_prealloc_size                               | 8192                 |
| range_alloc_block_size                            | 4096                 |
| read_buffer_size                                  | 131072               |
| read_rnd_buffer_size                              | 262144               |
| sort_buffer_size                                  | 2097152              |
| sql_max_join_size                                 | 18446744073709551615 |
| thread_cache_size                                 | 8                    |
| tmp_table_size                                    | 67108864             |
| transaction_alloc_block_size                      | 8192                 |
| transaction_prealloc_size                         | 4096                 |
+---------------------------------------------------+----------------------+

| events | CREATE TABLE `events` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
  `ticketSource` varchar(255) NOT NULL,
  `ticketStatus` varchar(6) NOT NULL DEFAULT 'CREATE' COMMENT 'CREATE or UPDATE',
  `ticketRef` varchar(255) DEFAULT NULL,
  `externalEventID` varchar(255) DEFAULT NULL,
  `eventName` varchar(255) DEFAULT NULL,
  `eventDescription` mediumtext,
  `eventStatus` varchar(45) DEFAULT 'healthy',
  `eventPriority` tinyint(6) NOT NULL DEFAULT '5',
  `eventStateful` tinyint(1) NOT NULL DEFAULT '0',
  `deviceID` varchar(45) DEFAULT NULL ,
  `deviceName` varchar(255) DEFAULT NULL,
  `deviceIpAddress` varchar(15) DEFAULT NULL,
  `customerName` varchar(255) DEFAULT NULL,
  `dateCreated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `dateUpdated` datetime NOT NULL,
  `externalTicketRef` varchar(255) DEFAULT NULL,
  `eventNotes` text,
  `bID` varchar(45) DEFAULT NULL,
  `suppressed` tinyint(1) NOT NULL DEFAULT '0',
  `processed` tinyint(4) NOT NULL DEFAULT '0',
  `rString` text,
  `rUrl` varchar(255) DEFAULT NULL,
  `rDuring` varchar(255) DEFAULT NULL,
  `rh` varchar(255) DEFAULT NULL,
  `eventCleared` tinyint(4) NOT NULL DEFAULT '0',
  PRIMARY KEY (`ID`),
  KEY `proc_tickStat_date_evtStat_cust` (`processed`,`ticketStatus`,`dateCreated`),
  KEY `proc_tStat_eName_dName_dUpdate` (`processed`,`ticketStatus`,`eventName`,`deviceName`,`dateUpdated`),
  KEY `proc_tickStat_date_cust_eName` (`processed`,`ticketStatus`,`dateCreated`,`customerName`,`eventName`),
  KEY `percent_suppressed` (`dateCreated`,`processed`,`suppressed`),
  KEY `scheduler_idx` (`deviceName`,`ticketRef`,`ticketStatus`,`processed`,`dateUpdated`),
  KEY `dName_extEvent` (`deviceName`,`externalEventID`),
  KEY `trends` (`customerName`(50),`processed`,`ticketStatus`,`dateCreated`,`eventName`(128),`eventPriority`)
) ENGINE=InnoDB AUTO_INCREMENT=2277803 DEFAULT CHARSET=utf8

mysql> show indexes from events;
+--------+------------+---------------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table  | Non_unique | Key_name                        | Seq_in_index | Column_name     | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------+------------+---------------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| events |          0 | PRIMARY                         |            1 | ID              | A         |     1757339 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | proc_tickStat_date_evtStat_cust |            1 | processed       | A         |          10 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | proc_tickStat_date_evtStat_cust |            2 | ticketStatus    | A         |          10 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | proc_tickStat_date_evtStat_cust |            3 | dateCreated     | A         |     1757339 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | proc_tStat_eName_dName_dUpdate  |            1 | processed       | A         |          10 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | proc_tStat_eName_dName_dUpdate  |            2 | ticketStatus    | A         |          10 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | proc_tStat_eName_dName_dUpdate  |            3 | eventName       | A         |          10 |     NULL | NULL   | YES  | BTREE      |         |               |
| events |          1 | proc_tStat_eName_dName_dUpdate  |            4 | deviceName      | A         |       13213 |     NULL | NULL   | YES  | BTREE      |         |               |
| events |          1 | proc_tStat_eName_dName_dUpdate  |            5 | dateUpdated     | A         |     1757339 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | proc_tickStat_date_cust_eName   |            1 | processed       | A         |          10 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | proc_tickStat_date_cust_eName   |            2 | ticketStatus    | A         |          10 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | proc_tickStat_date_cust_eName   |            3 | dateCreated     | A         |     1757339 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | proc_tickStat_date_cust_eName   |            4 | customerName    | A         |     1757339 |     NULL | NULL   | YES  | BTREE      |         |               |
| events |          1 | proc_tickStat_date_cust_eName   |            5 | eventName       | A         |     1757339 |     NULL | NULL   | YES  | BTREE      |         |               |
| events |          1 | percent_suppressed              |            1 | dateCreated     | A         |     1757339 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | percent_suppressed              |            2 | processed       | A         |     1757339 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | percent_suppressed              |            3 | suppressed      | A         |     1757339 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | scheduler_idx                   |            1 | deviceName      | A         |        2183 |     NULL | NULL   | YES  | BTREE      |         |               |
| events |          1 | scheduler_idx                   |            2 | ticketRef       | A         |     1757339 |     NULL | NULL   | YES  | BTREE      |         |               |
| events |          1 | scheduler_idx                   |            3 | ticketStatus    | A         |     1757339 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | scheduler_idx                   |            4 | processed       | A         |     1757339 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | scheduler_idx                   |            5 | dateUpdated     | A         |     1757339 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | dName_extEvent                  |            1 | deviceName      | A         |        5183 |     NULL | NULL   | YES  | BTREE      |         |               |
| events |          1 | dName_extEvent                  |            2 | externalEventID | A         |     1757339 |     NULL | NULL   | YES  | BTREE      |         |               |
| events |          1 | trends                          |            1 | customerName    | A         |          10 |       50 | NULL   | YES  | BTREE      |         |               |
| events |          1 | trends                          |            2 | processed       | A         |          10 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | trends                          |            3 | ticketStatus    | A         |          10 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | trends                          |            4 | dateCreated     | A         |     1757339 |     NULL | NULL   |      | BTREE      |         |               |
| events |          1 | trends                          |            5 | eventName       | A         |     1757339 |      128 | NULL   | YES  | BTREE      |         |               |
| events |          1 | trends                          |            6 | eventPriority   | A         |     1757339 |     NULL | NULL   |      | BTREE      |         |               |
+--------+------------+---------------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+


mysql> show table status like 'events';
+--------+--------+---------+------------+---------+----------------+-------------+-----------------+--------------+------------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+
| Name   | Engine | Version | Row_format | Rows    | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free  | Auto_increment | Create_time         | Update_time | Check_time | Collation       | Checksum | Create_options | Comment |
+--------+--------+---------+------------+---------+----------------+-------------+-----------------+--------------+------------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+
| events | InnoDB |      10 | Compact    | 2018434 |           1648 |  3327131648 |               0 |   1028816896 | 1757413376 |        2294083 | 2015-11-13 09:16:08 | NULL        | NULL       | utf8_general_ci |     NULL |                |         |
+--------+--------+---------+------------+---------+----------------+-------------+-----------------+--------------+------------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+

The explain:

EXPLAIN 
SELECT  DATE_FORMAT(dateCreated, '%Y-%m-%d') as date, eventPriority,
        count(*) as total
    FROM  `events`
    WHERE  customerName="Customer A"
      AND  processed=1
      AND  ticketStatus="CREATE"
      AND  dateCreated >= "2015-07-01"
      AND  dateCreated <= "2015-10-01"
    GROUP BY  date, eventPriority
    ORDER BY  `date` ASC;

--------+------+--------+----------------------------------------------+
| id | select_type | table  | type  | possible_keys                                                                                                          | key    | key_len | ref  | rows   | Extra                                        |
+----+-------------+--------+-------+------------------------------------------------------------------------------------------------------------------------+--------+---------+------+--------+----------------------------------------------+
|  1 | SIMPLE      | events | range | proc_tickStat_date_evtStat_cust,proc_tStat_eName_dName_dUpdate,proc_tickStat_date_cust_eName,percent_suppressed,trends | trends | 178     | NULL | 141226 | Using where; Using temporary; Using filesort |
+----+-------------+--------+-------+------------------------------------------------------------------------------------------------------------------------+--------+---------+------+--------+----------------------------------------------+

368 rows in set (18.40 sec)

I have a number of similar queries so I'm trying to reuse indexes where possible and order my queries in code to match left most as much as possible:

SELECT  `deviceName`, `suppressed`, `ticketRef`
    FROM  `events`
    WHERE  customerName 'Customer A'
      AND  processed=1
      AND  ticketStatus = 'CREATE'
      AND  dateCreated >= '2015-07-01'
      AND  dateCreated <= '2015-10-01'
      AND  eventStatus != 'HEALTHY'
      AND  suppressed = 0
      AND  ticketSource IN ('a', 'b', 'c', 'd', 'e')
    ORDER BY  `ID` DESC

SELECT  count(*) as total, eventName, eventName as id
    FROM  `events`
    WHERE  customerName="S"
      AND  processed=N
      AND  ticketStatus="S"
      AND  dateCreated >= "S"
      AND  dateCreated <= "S"
    GROUP BY  `eventName`
    ORDER BY  `total` DESC
    LIMIT  N 

This is running on an ubuntu server with 2cores and 2GB ram (test server) – we have similar poor performance on an identical data set in production with 4 cores and 16GB ram

I'm tempted to move more logic from the database into code but I've seen posts with people handling 100million rows so I'm guessing its just poor indexing which is the problem – should I consider partionioning on date by month or shard it or is that overkill for this dataset ?

thanks

sam

Best Answer

Your query formatted a bit:

SELECT DATE_FORMAT(dateCreated, '%Y-%m-%d') AS date,
       eventPriority,
       count(*) AS total
FROM `events`
WHERE customerName="Customer A"
  AND processed=1
  AND ticketStatus="CREATE"
  AND dateCreated >= "2015-07-01"
  AND dateCreated <= "2015-10-01"
GROUP BY date, eventPriority
ORDER BY `date` ASC;

The explain shows that the trends index is used and it is defined as

KEY `trends` (
    `customerName`(50),
    `processed`,
    `ticketStatus`,
    `dateCreated`,
    `eventName`(128),
    `eventPriority`
);

As we can see, the index contains all the columns of your query such a way that we can quickly get all events of "Customer A" which are "processed" and have status "CREATED" and then we can just read the right range for dateCreated. That seems really good, but as we can see from the EXPLAIN, there is one additional step after those - for all the index items found to satisfy the conditions it takes the ID (PRIMARY KEY is a part of every secondary index in InnoDB) and locates the row in the "main" part of the table. The values of those IDs are quite randomly distributed (even if we expect that autoincrement somehow coincides with the dateCreated, it is still probabůy not any continuous set, because events of many customers are inside that date range) so that means the query is doing ~140k random lookups by primary key to locate all the rows and build the temp table used for grouping.

The index contains all the columns used in your query (even the eventPriority) so it looks like it might be covering and this last step not necessary. But because the `customerName`(50) part, MySQL is afraid that it will not be able to exactly check the customerName value, because if there were a row with text longer than 50 characters, the index would contain only a part of it. We can see that it is not needed for the "Customer A" as it is quite short - but this optimization is not yet implemented so MySQL just has to get the full row.

When you remove that limit from the index, MySQL can see that it has all the data to resolve your query just by reading the index items (as even eventPriority is included - and eventName but that can be skipped). So you save tens of thousands of "random" lookups because the index contains what you need and it is sorted such a way that it is accessible fast.

===

There might be one more possibility for making it even faster. Currently the GROUP BY date means grouping by result of a function, that has to be computed for every matching row so it is not possible to use index for that. You might create another column in the table for storing the result of DATE_FORMAT(dateCreated, '%Y-%m-%d') - it might be managed by a trigger, by your application or using a new MySQL feature - generated columns. Then you would just add that to your index (probably the best place would be just before dateCreated).

But it is probably not needed unless some customer generates really many events for some date range or if you needed to group very long ranges.