We have a MariaDB table (stories) with over 1 TB of data, and we periodically run a query that fetches recently added rows for indexing somewhere else.
innodb_version: 5.6.36-82.1
version : 10.1.26-MariaDB-0+deb9u1
The query works just fine when the query optimizer decides to use the secondary index to do a range scan (in batches of 1000):
explain extended SELECT stories.item_guid
FROM `stories`
WHERE (updated_at >= '2018-09-21 15:00:00')
AND (updated_at <= '2018-09-22 05:30:00')
ORDER BY `stories`.`id` ASC
LIMIT 1000;
+------+-------------+---------+-------+-----------------------------+-----------------------------+---------+------+--------+----------+---------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+-------------+---------+-------+-----------------------------+-----------------------------+---------+------+--------+----------+---------------------------------------+
| 1 | SIMPLE | stories | range | index_stories_on_updated_at | index_stories_on_updated_at | 5 | NULL | 192912 | 100.00 | Using index condition; Using filesort |
+------+-------------+---------+-------+-----------------------------+-----------------------------+---------+------+--------+----------+---------------------------------------+
1 row in set, 1 warning (0.00 sec)
But occasionally, even with small differences in the requested range (note the second timestamp compared with the query above; it's worth mentioning that the table holds several years of data and several dozen million rows), the optimizer decides to use the primary key index instead:
explain extended SELECT stories.item_guid
FROM `stories`
WHERE (updated_at >= '2018-09-21 15:00:00')
AND (updated_at <= '2018-09-22 06:30:00')
ORDER BY `stories`.`id` ASC
LIMIT 1000;
+------+-------------+---------+-------+-----------------------------+---------+---------+------+--------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+-------------+---------+-------+-----------------------------+---------+---------+------+--------+----------+-------------+
| 1 | SIMPLE | stories | index | index_stories_on_updated_at | PRIMARY | 8 | NULL | 240259 | 83.81 | Using where |
+------+-------------+---------+-------+-----------------------------+---------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
This causes it to walk through the whole primary key index (sequentially, I guess) and then filter on the updated_at field, taking several hours to complete instead.
The query was generated by the ActiveRecord ORM and is probably far from ideal. As a workaround, we came up with a couple of manually crafted queries that move the ORDER BY stories.id out, and/or use USE INDEX / FORCE INDEX to steer the optimizer away from the PK, since we are really filtering our data set by updated_at.
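For illustration, one of those hand-crafted workarounds might look like this (a sketch; the actual queries are not shown in the question):

```sql
-- FORCE INDEX steers the optimizer onto the secondary index,
-- avoiding the full primary-key scan.
SELECT stories.item_guid
FROM stories FORCE INDEX (index_stories_on_updated_at)
WHERE updated_at >= '2018-09-21 15:00:00'
  AND updated_at <= '2018-09-22 06:30:00'
ORDER BY stories.id ASC
LIMIT 1000;
```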
What I'm interested in understanding here is how/why the query optimizer chooses that execution plan. I understand that the optimizer uses index/table statistics to make that decision, but in this case, if my understanding of how InnoDB works is correct, it seems pretty clear that walking through a huge PK while not filtering on the PK at all is not ideal.
I'm essentially trying to understand where that "bad" decision comes from: what statistics or unknown variables lead it away from the good plan (the one that is usually chosen and which is orders of magnitude faster). Feel free to correct any of my assumptions, as I'm definitely not a DBA expert.
Thanks in advance!
Best Answer
Short Answer: The WHERE and the ORDER BY are tugging in different directions. The Optimizer is not yet smart enough to always correctly decide which direction to be pulled.

Long Answer:
The WHERE benefits from any index starting with updated_at. (See the "100.00" in the filtered column, among other things.) Such an index does a quick job of finding the desired rows, all 192K of them. But the query then needs to sort all 192K rows (for the ORDER BY) before it can get to the LIMIT.

The ORDER BY id benefits from the PRIMARY KEY. This index lets the query get all rows in order (thereby avoiding a sort) and hence can get to the LIMIT early (thereby never shoveling around more than 1000 rows).

If the 1000 are found at small values of id, it will run fast. If the desired 1000 are late in the table, the query will run slowly. The Optimizer can't predict which (without a lot more smarts and complexity).

Does updated_at mostly track id? If so, then either index would lead to essentially the same blocks. But the Optimizer neither notices this nor takes advantage of it.

A possible speedup:
A "covering" index is one that has all the needed columns. Since the PK is 'clustered' with the data, PRIMARY KEY(id) is sort of a covering index. But we saw how bad it could be. The following may be better:

The query would range-scan the 192K rows faster than if it needed to bounce between INDEX(updated_at) and the data 192K times to find item_guid. The sort would not be improved.

Alas, that speeds up the faster query. So, let me put my thinking cap back on.
Partitioning?
Oh, you may have found a 5th use case for PARTITIONing. Six years ago, I had only 4 use cases; I have yet to find a new one. Let me talk through how your case might be different.

Suppose you used PARTITION BY RANGE (TO_DAYS(updated_at)) on the table, setting up about 50 partitions. (If you need to purge 'old' data, partitioning is excellent for that.)

When setting up partitioning, one needs to rethink all the indexes. What indexes do you have now? I'll assume these:
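The table definition isn't shown, so this is a guess at the existing layout (the index name is taken from the EXPLAIN output):

```sql
-- Assumed current indexes; SHOW CREATE TABLE would confirm.
PRIMARY KEY (id),
INDEX index_stories_on_updated_at (updated_at)
```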
In your case, perhaps not much would change:
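One caveat worth noting: on a MySQL/MariaDB partitioned table, every UNIQUE key (including the PRIMARY KEY) must contain all columns of the partitioning expression, so the layout would become something like this (again an assumption):

```sql
-- updated_at must be added to the PK because it is the partition key.
PRIMARY KEY (id, updated_at),
INDEX index_stories_on_updated_at (updated_at)
```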
What would happen to your queries?
Tackling the WHERE first would not change much. Yes, there would be partition pruning, plus the secondary index. Speed would stay about the same.

By tackling the ORDER BY first, the query plan could also prune down to one (or two) partitions. Then the scan in id order within that/those partition(s) would be 50 (or 25) times as fast. There would probably be an added sort, since the rows from separate partitions won't be in order.

Other notes
Could we see SHOW CREATE TABLE? A 1TB table needs a lot of things to help with performance. If all your numbers are INTs, then there might be a lot of space to recover by shrinking to MEDIUMINT, etc. Normalization may be worth doing. If the GUID is indexed, that must be a big burden on insert performance. Etc, etc.

Ypercube's
If ORDER BY updated_at, id is OK for the task, then that eliminates the temptation to use the slow explain plan. And INDEX(updated_at, id, item_guid) becomes the best index. And partitioning becomes useless.

Pagination
Not via OFFSET; remember where you left off. And this discusses how to deal with "left off" when it involves more than a single column.
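A minimal sketch of that "remember where you left off" pattern with a two-column position (:last_updated_at and :last_id are placeholders for the values saved from the previous batch; the covering INDEX(updated_at, id, item_guid) is assumed):

```sql
-- Fetch the next batch strictly after the last (updated_at, id) seen.
-- The expanded OR form is used because older MariaDB versions may not
-- use an index efficiently for row-constructor comparisons.
SELECT item_guid, updated_at, id
FROM stories
WHERE updated_at > :last_updated_at
   OR (updated_at = :last_updated_at AND id > :last_id)
ORDER BY updated_at, id
LIMIT 1000;
```

In practice, one often adds a redundant `updated_at >= :last_updated_at` conjunct so that the optimizer can still use a range access path on the index despite the OR.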