I have simplified a much larger table below:
CREATE TABLE `core` (
`id` int NOT NULL,
`loc_country` enum('United States','Colombia','United Kingdom',
'Australia','India','Germany','Canada','Korea','Netherlands',
'200 more') CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL,
`loc_city` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_ci DEFAULT NULL,
`job` enum('a','b','c','d') CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `loc_country_2` (`loc_country`,`job`,`loc_city`(6))
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
ROW_FORMAT=COMPRESSED
explain format=json
SELECT id FROM core
WHERE id!=518601449
AND loc_country='Mongolia'
AND id < 518601449
AND job='a'
LIMIT 151\G
*************************** 1. row ***************************
EXPLAIN: {
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "14002.99"
},
"table": {
"table_name": "core",
"access_type": "range",
"possible_keys": [
"PRIMARY",
"loc_country_2"
],
"key": "loc_country_2",
"used_key_parts": [
"loc_country",
"job",
"loc_city",
"id"
],
"key_length": "34",
"rows_examined_per_scan": 45657,
"rows_produced_per_join": 45657,
"filtered": "100.00",
"using_index_for_skip_scan": true,
"cost_info": {
"read_cost": "9437.29",
"eval_cost": "4565.70",
"prefix_cost": "14002.99",
"data_read_per_join": "1G"
},
"used_columns": [
"id",
"loc_country",
"job"
],
"attached_condition": "((`api`.`core`.`job` = 'a') and (`api`.`core`.`loc_country` = 'Mongolia') and (`api`.`core`.`id` <> 518601449) and (`api`.`core`.`id` < 518601449))"
}
}
}
The query took 14 seconds to run, I need it to be done in 0.01 seconds
The biggest problem seems to be to use id < XXX and order by id, I thought that this should be "free" to use given that the id is primary key.
I need the id < and sort because I need to get a different portion out of the database with each query, if I'd not use it I would receive the same data with each query for each country+job
I can not parition the table as I have dozens of such queries using different columns, it's just one example.
I believe the compression makes a big impact, it might be responsible for most issues I have though I don't have the storage on my NVME disks to run without compression.
Would it help to add the primary key to the indexes I have? At the end ? I fear it would waste a lot of storage space.
Any ideas ?
Best Answer
ASPECT #1: Indexing
There is no need to add the PRIMARY KEY value
id
to the index. Why ?According to the MySQL Documentation of Clustered and Secondary Indexes, Paragraphs 1,2 under the Subheading
How Secondary Indexes Relate to the Clustered Index
say the following:Thus, the primary key is automatically added to a secondary index.
The proof of this is in the XML output
and
ASPECT #2: Table Compression
Compression might be greatly contributing to the slowness. After all, InnoDB must decompress data coming from that table for it to reside in the InnoDB Buffer Pool. In addition, the InnoDB Buffer Pool holds both compressed and uncompressed data and index pages. How so ???
According to the MySQL Documentation on How Compression Works for InnoDB Tables , Paragraphs 1,2 under the Subheading
Compression and the InnoDB Buffer Pool
say the following:SUGGESTION #1
Since the Buffer Pool holds both compressed and uncompressed pages, maybe increasing the InnoDB Buffer Pool size might slightly improve things. How ?
InnoDB has to evict unused pages from the Buffer Pool. A larger buffer pool reduces the number of times the Buffer Pool hasd to be cleared to make waves for new pages.
SUGGESTION #2
Perhaps changing the size of compressed pages might slightly improve things. This would require reloading the table with the new compression size. Please see my old post innodb_file_format Barracuda on changing the
KEY_BLOCK_SIZE
would be needed.UPDATE 2020-12-12 19:52
Here is a crazy trick to try
Instead of your query
Try refactoring as follows
This will force the loc_country_2 index to gather ids in the subqueries first. Then, all IDs < 518601449 are dismissed. Finally, the LIMIT 151 is imposed.
I do not guarantee the result will be better. It could be worse.
You won't know until you try it out and see or at least look at the EXPLAIN plan !!!