I have a table with about 2 million rows (growing at about 1 million every 3 months) and am having trouble with slow queries. I've tried changing my query and indexes (ensuring left most order) to no avail – can anyone help?
+---------------------------------+------------------------+
| Variable_name | Value |
+---------------------------------+------------------------+
| have_innodb | YES |
| ignore_builtin_innodb | OFF |
| innodb_adaptive_flushing | ON |
| innodb_adaptive_hash_index | ON |
| innodb_additional_mem_pool_size | 8388608 |
| innodb_autoextend_increment | 8 |
| innodb_autoinc_lock_mode | 1 |
| innodb_buffer_pool_instances | 1 |
| innodb_buffer_pool_size | 134217728 |
| innodb_change_buffering | all |
| innodb_checksums | ON |
| innodb_commit_concurrency | 0 |
| innodb_concurrency_tickets | 500 |
| innodb_data_file_path | ibdata1:10M:autoextend |
| innodb_data_home_dir | |
| innodb_doublewrite | ON |
| innodb_fast_shutdown | 1 |
| innodb_file_format | Antelope |
| innodb_file_format_check | ON |
| innodb_file_format_max | Antelope |
| innodb_file_per_table | OFF |
| innodb_flush_log_at_trx_commit | 1 |
| innodb_flush_method | |
| innodb_force_load_corrupted | OFF |
| innodb_force_recovery | 0 |
| innodb_io_capacity | 200 |
| innodb_large_prefix | OFF |
| innodb_lock_wait_timeout | 50 |
| innodb_locks_unsafe_for_binlog | OFF |
| innodb_log_buffer_size | 8388608 |
| innodb_log_file_size | 5242880 |
| innodb_log_files_in_group | 2 |
| innodb_log_group_home_dir | ./ |
| innodb_max_dirty_pages_pct | 75 |
| innodb_max_purge_lag | 0 |
| innodb_mirrored_log_groups | 1 |
| innodb_old_blocks_pct | 37 |
| innodb_old_blocks_time | 0 |
| innodb_open_files | 300 |
| innodb_print_all_deadlocks | OFF |
| innodb_purge_batch_size | 20 |
| innodb_purge_threads | 0 |
| innodb_random_read_ahead | OFF |
| innodb_read_ahead_threshold | 56 |
| innodb_read_io_threads | 4 |
| innodb_replication_delay | 0 |
| innodb_rollback_on_timeout | OFF |
| innodb_rollback_segments | 128 |
| innodb_spin_wait_delay | 6 |
| innodb_stats_method | nulls_equal |
| innodb_stats_on_metadata | ON |
| innodb_stats_sample_pages | 8 |
| innodb_strict_mode | OFF |
| innodb_support_xa | ON |
| innodb_sync_spin_loops | 30 |
| innodb_table_locks | ON |
| innodb_thread_concurrency | 0 |
| innodb_thread_sleep_delay | 10000 |
| innodb_use_native_aio | OFF |
| innodb_use_sys_malloc | ON |
| innodb_version | 5.5.37 |
| innodb_write_io_threads | 4 |
+---------------------------------+------------------------+
mysql> show variables like '%size%';
+---------------------------------------------------+----------------------+
| Variable_name | Value |
+---------------------------------------------------+----------------------+
| binlog_cache_size | 32768 |
| binlog_stmt_cache_size | 32768 |
| bulk_insert_buffer_size | 8388608 |
| delayed_queue_size | 1000 |
| innodb_additional_mem_pool_size | 8388608 |
| innodb_buffer_pool_size | 134217728 |
| innodb_log_buffer_size | 8388608 |
| innodb_log_file_size | 5242880 |
| innodb_purge_batch_size | 20 |
| join_buffer_size | 131072 |
| key_buffer_size | 16777216 |
| key_cache_block_size | 1024 |
| large_page_size | 0 |
| max_binlog_cache_size | 18446744073709547520 |
| max_binlog_size | 104857600 |
| max_binlog_stmt_cache_size | 18446744073709547520 |
| max_heap_table_size | 67108864 |
| max_join_size | 18446744073709551615 |
| max_long_data_size | 16777216 |
| max_relay_log_size | 0 |
| metadata_locks_cache_size | 1024 |
| myisam_data_pointer_size | 6 |
| myisam_max_sort_file_size | 9223372036853727232 |
| myisam_mmap_size | 18446744073709551615 |
| myisam_sort_buffer_size | 8388608 |
| performance_schema_events_waits_history_long_size | 10000 |
| performance_schema_events_waits_history_size | 10 |
| preload_buffer_size | 32768 |
| profiling_history_size | 15 |
| query_alloc_block_size | 8192 |
| query_cache_size | 16777216 |
| query_prealloc_size | 8192 |
| range_alloc_block_size | 4096 |
| read_buffer_size | 131072 |
| read_rnd_buffer_size | 262144 |
| sort_buffer_size | 2097152 |
| sql_max_join_size | 18446744073709551615 |
| thread_cache_size | 8 |
| tmp_table_size | 67108864 |
| transaction_alloc_block_size | 8192 |
| transaction_prealloc_size | 4096 |
+---------------------------------------------------+----------------------+
| events | CREATE TABLE `events` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`ticketSource` varchar(255) NOT NULL,
`ticketStatus` varchar(6) NOT NULL DEFAULT 'CREATE' COMMENT 'CREATE or UPDATE',
`ticketRef` varchar(255) DEFAULT NULL,
`externalEventID` varchar(255) DEFAULT NULL,
`eventName` varchar(255) DEFAULT NULL,
`eventDescription` mediumtext,
`eventStatus` varchar(45) DEFAULT 'healthy',
`eventPriority` tinyint(6) NOT NULL DEFAULT '5',
`eventStateful` tinyint(1) NOT NULL DEFAULT '0',
`deviceID` varchar(45) DEFAULT NULL ,
`deviceName` varchar(255) DEFAULT NULL,
`deviceIpAddress` varchar(15) DEFAULT NULL,
`customerName` varchar(255) DEFAULT NULL,
`dateCreated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`dateUpdated` datetime NOT NULL,
`externalTicketRef` varchar(255) DEFAULT NULL,
`eventNotes` text,
`bID` varchar(45) DEFAULT NULL,
`suppressed` tinyint(1) NOT NULL DEFAULT '0',
`processed` tinyint(4) NOT NULL DEFAULT '0',
`rString` text,
`rUrl` varchar(255) DEFAULT NULL,
`rDuring` varchar(255) DEFAULT NULL,
`rh` varchar(255) DEFAULT NULL,
`eventCleared` tinyint(4) NOT NULL DEFAULT '0',
PRIMARY KEY (`ID`),
KEY `proc_tickStat_date_evtStat_cust` (`processed`,`ticketStatus`,`dateCreated`),
KEY `proc_tStat_eName_dName_dUpdate` (`processed`,`ticketStatus`,`eventName`,`deviceName`,`dateUpdated`),
KEY `proc_tickStat_date_cust_eName` (`processed`,`ticketStatus`,`dateCreated`,`customerName`,`eventName`),
KEY `percent_suppressed` (`dateCreated`,`processed`,`suppressed`),
KEY `scheduler_idx` (`deviceName`,`ticketRef`,`ticketStatus`,`processed`,`dateUpdated`),
KEY `dName_extEvent` (`deviceName`,`externalEventID`),
KEY `trends` (`customerName`(50),`processed`,`ticketStatus`,`dateCreated`,`eventName`(128),`eventPriority`)
) ENGINE=InnoDB AUTO_INCREMENT=2277803 DEFAULT CHARSET=utf8
mysql> show indexes from events;
+--------+------------+---------------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------+------------+---------------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| events | 0 | PRIMARY | 1 | ID | A | 1757339 | NULL | NULL | | BTREE | | |
| events | 1 | proc_tickStat_date_evtStat_cust | 1 | processed | A | 10 | NULL | NULL | | BTREE | | |
| events | 1 | proc_tickStat_date_evtStat_cust | 2 | ticketStatus | A | 10 | NULL | NULL | | BTREE | | |
| events | 1 | proc_tickStat_date_evtStat_cust | 3 | dateCreated | A | 1757339 | NULL | NULL | | BTREE | | |
| events | 1 | proc_tStat_eName_dName_dUpdate | 1 | processed | A | 10 | NULL | NULL | | BTREE | | |
| events | 1 | proc_tStat_eName_dName_dUpdate | 2 | ticketStatus | A | 10 | NULL | NULL | | BTREE | | |
| events | 1 | proc_tStat_eName_dName_dUpdate | 3 | eventName | A | 10 | NULL | NULL | YES | BTREE | | |
| events | 1 | proc_tStat_eName_dName_dUpdate | 4 | deviceName | A | 13213 | NULL | NULL | YES | BTREE | | |
| events | 1 | proc_tStat_eName_dName_dUpdate | 5 | dateUpdated | A | 1757339 | NULL | NULL | | BTREE | | |
| events | 1 | proc_tickStat_date_cust_eName | 1 | processed | A | 10 | NULL | NULL | | BTREE | | |
| events | 1 | proc_tickStat_date_cust_eName | 2 | ticketStatus | A | 10 | NULL | NULL | | BTREE | | |
| events | 1 | proc_tickStat_date_cust_eName | 3 | dateCreated | A | 1757339 | NULL | NULL | | BTREE | | |
| events | 1 | proc_tickStat_date_cust_eName | 4 | customerName | A | 1757339 | NULL | NULL | YES | BTREE | | |
| events | 1 | proc_tickStat_date_cust_eName | 5 | eventName | A | 1757339 | NULL | NULL | YES | BTREE | | |
| events | 1 | percent_suppressed | 1 | dateCreated | A | 1757339 | NULL | NULL | | BTREE | | |
| events | 1 | percent_suppressed | 2 | processed | A | 1757339 | NULL | NULL | | BTREE | | |
| events | 1 | percent_suppressed | 3 | suppressed | A | 1757339 | NULL | NULL | | BTREE | | |
| events | 1 | scheduler_idx | 1 | deviceName | A | 2183 | NULL | NULL | YES | BTREE | | |
| events | 1 | scheduler_idx | 2 | ticketRef | A | 1757339 | NULL | NULL | YES | BTREE | | |
| events | 1 | scheduler_idx | 3 | ticketStatus | A | 1757339 | NULL | NULL | | BTREE | | |
| events | 1 | scheduler_idx | 4 | processed | A | 1757339 | NULL | NULL | | BTREE | | |
| events | 1 | scheduler_idx | 5 | dateUpdated | A | 1757339 | NULL | NULL | | BTREE | | |
| events | 1 | dName_extEvent | 1 | deviceName | A | 5183 | NULL | NULL | YES | BTREE | | |
| events | 1 | dName_extEvent | 2 | externalEventID | A | 1757339 | NULL | NULL | YES | BTREE | | |
| events | 1 | trends | 1 | customerName | A | 10 | 50 | NULL | YES | BTREE | | |
| events | 1 | trends | 2 | processed | A | 10 | NULL | NULL | | BTREE | | |
| events | 1 | trends | 3 | ticketStatus | A | 10 | NULL | NULL | | BTREE | | |
| events | 1 | trends | 4 | dateCreated | A | 1757339 | NULL | NULL | | BTREE | | |
| events | 1 | trends | 5 | eventName | A | 1757339 | 128 | NULL | YES | BTREE | | |
| events | 1 | trends | 6 | eventPriority | A | 1757339 | NULL | NULL | | BTREE | | |
+--------+------------+---------------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
mysql> show table status like 'events';
+--------+--------+---------+------------+---------+----------------+-------------+-----------------+--------------+------------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+
| Name | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time | Update_time | Check_time | Collation | Checksum | Create_options | Comment |
+--------+--------+---------+------------+---------+----------------+-------------+-----------------+--------------+------------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+
| events | InnoDB | 10 | Compact | 2018434 | 1648 | 3327131648 | 0 | 1028816896 | 1757413376 | 2294083 | 2015-11-13 09:16:08 | NULL | NULL | utf8_general_ci | NULL | | |
+--------+--------+---------+------------+---------+----------------+-------------+-----------------+--------------+------------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+
The explain:
EXPLAIN
SELECT DATE_FORMAT(dateCreated, '%Y-%m-%d') as date, eventPriority,
count(*) as total
FROM `events`
WHERE customerName="Customer A"
AND processed=1
AND ticketStatus="CREATE"
AND dateCreated >= "2015-07-01"
AND dateCreated <= "2015-10-01"
GROUP BY date, eventPriority
ORDER BY `date` ASC;
--------+------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+------------------------------------------------------------------------------------------------------------------------+--------+---------+------+--------+----------------------------------------------+
| 1 | SIMPLE | events | range | proc_tickStat_date_evtStat_cust,proc_tStat_eName_dName_dUpdate,proc_tickStat_date_cust_eName,percent_suppressed,trends | trends | 178 | NULL | 141226 | Using where; Using temporary; Using filesort |
+----+-------------+--------+-------+------------------------------------------------------------------------------------------------------------------------+--------+---------+------+--------+----------------------------------------------+
368 rows in set (18.40 sec)
I have a number of similar queries so I'm trying to reuse indexes where possible and order my queries in code to match left most as much as possible:
SELECT `deviceName`, `suppressed`, `ticketRef`
FROM `events`
WHERE customerName 'Customer A'
AND processed=1
AND ticketStatus = 'CREATE'
AND dateCreated >= '2015-07-01'
AND dateCreated <= '2015-10-01'
AND eventStatus != 'HEALTHY'
AND suppressed = 0
AND ticketSource IN ('a', 'b', 'c', 'd', 'e')
ORDER BY `ID` DESC
SELECT count(*) as total, eventName, eventName as id
FROM `events`
WHERE customerName="S"
AND processed=N
AND ticketStatus="S"
AND dateCreated >= "S"
AND dateCreated <= "S"
GROUP BY `eventName`
ORDER BY `total` DESC
LIMIT N
This is running on an ubuntu server with 2cores and 2GB ram (test server) – we have similar poor performance on an identical data set in production with 4 cores and 16GB ram
I'm tempted to move more logic from the database into code but I've seen posts with people handling 100million rows so I'm guessing its just poor indexing which is the problem – should I consider partionioning on date by month or shard it or is that overkill for this dataset ?
thanks
sam
Best Answer
Your query formatted a bit:
The explain shows that the
trends
index is used and it is defined asAs we can see, the index contains all the columns of your query such a way that we can quickly get all events of "Customer A" which are "processed" and have status "CREATED" and then we can just read the right range for
dateCreated
. That seems really good, but as we can see from theEXPLAIN
, there is one additional step after those - for all the index items found to satisfy the conditions it takes theID
(PRIMARY KEY
is a part of every secondary index in InnoDB) and locates the row in the "main" part of the table. The values of those IDs are quite randomly distributed (even if we expect that autoincrement somehow coincides with thedateCreated
, it is still probabůy not any continuous set, because events of many customers are inside that date range) so that means the query is doing ~140k random lookups by primary key to locate all the rows and build the temp table used for grouping.The index contains all the columns used in your query (even the
eventPriority
) so it looks like it might be covering and this last step not necessary. But because the`customerName`(50)
part, MySQL is afraid that it will not be able to exactly check thecustomerName
value, because if there were a row with text longer than 50 characters, the index would contain only a part of it. We can see that it is not needed for the "Customer A" as it is quite short - but this optimization is not yet implemented so MySQL just has to get the full row.When you remove that limit from the index, MySQL can see that it has all the data to resolve your query just by reading the index items (as even
eventPriority
is included - andeventName
but that can be skipped). So you save tens of thousands of "random" lookups because the index contains what you need and it is sorted such a way that it is accessible fast.===
There might be one more possibility for making it even faster. Currently the
GROUP BY date
means grouping by result of a function, that has to be computed for every matching row so it is not possible to use index for that. You might create another column in the table for storing the result ofDATE_FORMAT(dateCreated, '%Y-%m-%d')
- it might be managed by a trigger, by your application or using a new MySQL feature - generated columns. Then you would just add that to your index (probably the best place would be just beforedateCreated
).But it is probably not needed unless some customer generates really many events for some date range or if you needed to group very long ranges.