I have a query that is being made 18x slower by the presence of a single WHERE
clause. The query plan is an index-only scan, and the clause in question filters on one of the fields in the index.
This is the query; the offending clause is update_id < 40581:
EXPLAIN (VERBOSE, ANALYZE, COSTS, BUFFERS)
WITH a AS (
SELECT e_match.account_id, for_time, for_eid,
originating_update_id, attr1, attr2, attr3
FROM combinations combo
INNER JOIN (SELECT account_id, time as for_time, eid AS for_eid,
originating_update_id
FROM entries
WHERE account_id = '1'
AND time <= '2017-11-30 15:57:35.267405+00' AND
originating_update_id < 40581) e_match
ON combo.account_id = e_match.account_id)
SELECT a.attr1, a.attr2, a.attr3, b.amount, for_time, for_eid,
originating_update_id, b.eid = a.for_eid AS is_stored
FROM a
CROSS JOIN LATERAL (SELECT time, update_id, amount, eid
FROM updates
WHERE account_id = a.account_id AND
attr1 = a.attr1 AND
attr2 = a.attr2 AND
attr3 = a.attr3 AND
update_id < 40581 AND -- THIS CLAUSE MAKES IT SLOW
time <= a.for_time AND
(time < a.for_time OR
(time = a.for_time AND
originating_update_id <= a.originating_update_id))
ORDER BY time DESC,
originating_update_id DESC,
update_id DESC
LIMIT 1) b
ORDER BY for_time, originating_update_id, for_eid, attr1, attr2, attr3;
It takes 14241.977 ms to run, but if I remove the update_id < 40581 clause
it runs in 797.601 ms.
The query is relatively complex. It joins a table of entries with a table of all updates for those entries, in order to use the latest update for each of the known combinations of attributes. This is the loose-index-scan emulation trick. According to the query plan, most of the time is spent in the SELECT
on the updates table, and it is that time that varies between the two versions.
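For reference, the generic shape of that trick looks something like the sketch below (hypothetical table and column names, not the real schema above): each known key gets a single LATERAL probe that descends the index once and stops after one row.

```sql
-- Loose-index-scan emulation (sketch, hypothetical names):
-- for each key in a small driving table, fetch only its latest row
-- via LATERAL + ORDER BY ... LIMIT 1, so each probe is one index descent.
SELECT k.key, latest.*
FROM known_keys k
CROSS JOIN LATERAL (
  SELECT e.time, e.amount
  FROM events e
  WHERE e.key = k.key
  ORDER BY e.time DESC
  LIMIT 1
) latest;
```

With a covering index on (key, time DESC, amount), each LIMIT 1 probe can be satisfied by reading a single index entry, which is what makes the pattern fast in the happy case.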
The tables are structured as follows:
Table "public.updates"
Column | Type | Collation | Nullable | Default
-----------------------+-----------------------------+-----------+----------+---------
account_id | text | | not null |
attr1 | text | | not null |
eid | bigint | | not null |
as_of_eid | bigint | | not null |
update_id | bigint | | not null |
originating_update_id | bigint | | not null |
time | timestamp without time zone | | not null |
attr2 | integer | | not null |
attr3 | text | | not null |
amount | numeric | | not null |
Indexes:
"updates_pkey" PRIMARY KEY, btree (eid, as_of_eid, account_id, attr1)
"updates_update_id_eid_account_id_attr1_key" UNIQUE CONSTRAINT, btree (update_id, eid, account_id, attr1)
"updates_covering_idx2" btree (account_id, attr1, attr2, attr3, time DESC, update_id DESC, originating_update_id DESC, amount, eid)
"updates_covering_idx3" btree (account_id, attr1, attr2, attr3, time DESC, originating_update_id DESC, update_id DESC, amount, eid)
Table "public.entries"
Column | Type | Collation | Nullable | Default
-----------------------+-----------------------------+-----------+----------+---------
account_id | text | | not null |
eid | bigint | | not null |
time | timestamp without time zone | | |
originating_update_id | bigint | | |
Indexes:
"entries2_pkey" PRIMARY KEY, btree (account_id, eid)
"entries_by_time" btree (account_id, time, eid, originating_update_id)
Table "public.combinations"
Column | Type | Collation | Nullable | Default
--------------+---------+-----------+----------+---------
account_id | text | | not null |
attr1 | text | | not null |
attr2 | integer | | not null |
attr3 | text | | not null |
Indexes:
"combinations_pkey" PRIMARY KEY, btree (account_id, attr1, attr2, attr3)
This is the query plan from the slow run with the offending clause:
Sort (cost=65859.13..65983.86 rows=49889 width=107) (actual time=14237.002..14239.136 rows=12967 loops=1)
Output: a.attr1, a.attr2, a.attr3, updates.amount, a.for_time, a.for_eid, a.originating_update_id, ((updates.eid = a.for_eid))
Sort Key: a.for_time, a.originating_update_id, a.for_eid, a.attr1, a.attr2, a.attr3
Sort Method: quicksort Memory: 2208kB
Buffers: shared hit=849601
CTE a
-> Nested Loop (cost=0.83..1279.76 rows=49889 width=71) (actual time=0.035..16.200 rows=18102 loops=1)
Output: entries.account_id, entries.time, entries.eid, entries.originating_update_id, combo.attr1, combo.attr2, combo.attr3
Buffers: shared hit=69
-> Index Only Scan using entries_by_time on public.entries (cost=0.42..651.60 rows=7127 width=57) (actual time=0.021..1.895 rows=2586 loops=1)
Output: entries.account_id, entries.time, entries.eid, entries.originating_update_id
Index Cond: ((entries.account_id = '1'::text) AND (entries.time <= '2017-11-30 15:57:35.267405'::timestamp without time zone) AND (entries.originating_update_id < 40581))
Heap Fetches: 0
Buffers: shared hit=65
-> Materialize (cost=0.41..4.57 rows=7 width=50) (actual time=0.000..0.002 rows=7 loops=2586)
Output: combo.attr1, combo.attr2, combo.attr3, combo.account_id
Buffers: shared hit=4
-> Index Only Scan using combinations_pkey on public.combinations combo (cost=0.41..4.53 rows=7 width=50) (actual time=0.010..0.012 rows=7 loops=1)
Output: combo.attr1, combo.attr2, combo.attr3, combo.account_id
Index Cond: (combo.account_id = '1'::text)
Heap Fetches: 0
Buffers: shared hit=4
-> Nested Loop (cost=0.69..57784.93 rows=49889 width=107) (actual time=0.117..14217.115 rows=12967 loops=1)
Output: a.attr1, a.attr2, a.attr3, updates.amount, a.for_time, a.for_eid, a.originating_update_id, (updates.eid = a.for_eid)
Buffers: shared hit=849601
-> CTE Scan on a (cost=0.00..997.78 rows=49889 width=124) (actual time=0.037..30.353 rows=18102 loops=1)
Output: a.account_id, a.for_time, a.for_eid, a.originating_update_id, a.attr1, a.attr2, a.attr3
Buffers: shared hit=69
-> Limit (cost=0.69..1.12 rows=1 width=39) (actual time=0.783..0.783 rows=1 loops=18102)
Output: updates.time, updates.update_id, updates.amount, updates.eid, updates.originating_update_id
Buffers: shared hit=849532
-> Index Only Scan using updates_covering_idx3 on public.updates (cost=0.69..777.72 rows=1825 width=39) (actual time=0.782..0.782 rows=1 loops=18102)
Output: updates.time, updates.update_id, updates.amount, updates.eid, updates.originating_update_id
Index Cond: ((updates.account_id = a.account_id) AND (updates.attr1 = a.attr1) AND (updates.attr2 = a.attr2) AND (updates.attr3 = a.attr3) AND (updates.time <= a.for_time) AND (updates.update_id < 40581))
Filter: ((updates.time < a.for_time) OR ((updates.time = a.for_time) AND (updates.originating_update_id <= a.originating_update_id)))
Rows Removed by Filter: 6
Heap Fetches: 0
Buffers: shared hit=849532
Planning time: 0.446 ms
Execution time: 14241.977 ms
(40 rows)
and the fast run without the clause:
Sort (cost=58914.11..59038.83 rows=49889 width=107) (actual time=792.649..794.853 rows=12970 loops=1)
Output: a.attr1, a.attr2, a.attr3, updates.amount, a.for_time, a.for_eid, a.originating_update_id, ((updates.eid = a.for_eid))
Sort Key: a.for_time, a.originating_update_id, a.for_eid, a.attr1, a.attr2, a.attr3
Sort Method: quicksort Memory: 2208kB
Buffers: shared hit=175954
CTE a
-> Nested Loop (cost=0.83..1279.76 rows=49889 width=71) (actual time=0.036..15.278 rows=18102 loops=1)
Output: entries.account_id, entries.time, entries.eid, entries.originating_update_id, combo.attr1, combo.attr2, combo.attr3
Buffers: shared hit=69
-> Index Only Scan using entries_by_time on public.entries (cost=0.42..651.60 rows=7127 width=57) (actual time=0.022..1.675 rows=2586 loops=1)
Output: entries.account_id, entries.time, entries.eid, entries.originating_update_id
Index Cond: ((entries.account_id = '1'::text) AND (entries.time <= '2017-11-30 15:57:35.267405'::timestamp without time zone) AND (entries.originating_update_id < 40581))
Heap Fetches: 0
Buffers: shared hit=65
-> Materialize (cost=0.41..4.57 rows=7 width=50) (actual time=0.000..0.002 rows=7 loops=2586)
Output: combo.attr1, combo.attr2, combo.attr3, combo.account_id
Buffers: shared hit=4
-> Index Only Scan using combinations_pkey on public.combinations combo (cost=0.41..4.53 rows=7 width=50) (actual time=0.011..0.014 rows=7 loops=1)
Output: combo.attr1, combo.attr2, combo.attr3, combo.account_id
Index Cond: (combo.account_id = '1'::text)
Heap Fetches: 0
Buffers: shared hit=4
-> Nested Loop (cost=0.69..50839.90 rows=49889 width=107) (actual time=0.115..774.342 rows=12970 loops=1)
Output: a.attr1, a.attr2, a.attr3, updates.amount, a.for_time, a.for_eid, a.originating_update_id, (updates.eid = a.for_eid)
Buffers: shared hit=175954
-> CTE Scan on a (cost=0.00..997.78 rows=49889 width=124) (actual time=0.039..28.176 rows=18102 loops=1)
Output: a.account_id, a.for_time, a.for_eid, a.originating_update_id, a.attr1, a.attr2, a.attr3
Buffers: shared hit=69
-> Limit (cost=0.69..0.98 rows=1 width=39) (actual time=0.040..0.040 rows=1 loops=18102)
Output: updates.time, updates.update_id, updates.amount, updates.eid, updates.originating_update_id
Buffers: shared hit=175885
-> Index Only Scan using updates_covering_idx3 on public.updates (cost=0.69..808.79 rows=2820 width=39) (actual time=0.039..0.039 rows=1 loops=18102)
Output: updates.time, updates.update_id, updates.amount, updates.eid, updates.originating_update_id
Index Cond: ((updates.account_id = a.account_id) AND (updates.attr1 = a.attr1) AND (updates.attr2 = a.attr2) AND (updates.attr3 = a.attr3) AND (updates.time <= a.for_time))
Filter: ((updates.time < a.for_time) OR ((updates.time = a.for_time) AND (updates.originating_update_id <= a.originating_update_id)))
Rows Removed by Filter: 47
Heap Fetches: 0
Buffers: shared hit=175885
Planning time: 0.367 ms
Execution time: 797.601 ms
(40 rows)
There are 65788426 rows in the updates table, and it has been recently vacuumed and analyzed. The database settings are the Amazon RDS defaults. I've renamed the tables and fields in the output above; everything else is unabridged.
Why would that particular clause cause such a slowdown? The EXPLAIN ANALYZE
output is almost identical between the two runs; the main difference is the shared hit count.
Strangely, if I change the clause to one on some other field, e.g. originating_update_id < 100000000,
the query doesn't slow down.
UPDATE
I've simplified the query as @jjanes suggested:
EXPLAIN (VERBOSE, ANALYZE, COSTS, BUFFERS)
SELECT time, update_id, amount, eid
FROM updates
WHERE account_id = '1' AND
attr1 = '' AND
attr2 = 3 AND
attr3 = 'ABC' AND
update_id < 40581 AND -- THIS CLAUSE MAKES IT SLOW
time <= '2017-12-30 15:57:35.267405+00' AND
(time < '2017-12-30 15:57:35.267405+00' OR
(time = '2017-12-30 15:57:35.267405+00' AND
originating_update_id <= 10000000))
ORDER BY time DESC,
originating_update_id DESC,
update_id DESC
LIMIT 1;
With the offending clause it takes 35.5ms, reproducibly. Without that clause it takes 0.073ms.
Query plan with clause:
Limit (cost=0.69..0.82 rows=1 width=39) (actual time=35.467..35.468 rows=1 loops=1)
Output: time, update_id, amount, eid, originating_update_id
Buffers: shared hit=1856
-> Index Only Scan using updates_covering_idx3 on public.updates (cost=0.69..731.25 rows=5423 width=39) (actual time=35.465..35.465 rows=1 loops=1)
Output: time, update_id, amount, eid, originating_update_id
Index Cond: ((updates.account_id = '1'::text) AND (updates.attr1 = ''::text) AND (updates.attr2 = 3) AND (updates.attr3 = 'ABC'::text) AND (updates.time <= '2017-12-30 15:57:35.267405'::timestamp without time zone) AND (updates.update_id < 40581))
Filter: ((updates.time < '2017-12-30 15:57:35.267405'::timestamp without time zone) OR ((updates.time = '2017-12-30 15:57:35.267405'::timestamp without time zone) AND (updates.originating_update_id <= 10000000)))
Heap Fetches: 0
Buffers: shared hit=1856
Planning time: 0.202 ms
Execution time: 35.493 ms
(11 rows)
and without it:
Limit (cost=0.69..0.78 rows=1 width=39) (actual time=0.045..0.045 rows=1 loops=1)
Output: time, update_id, amount, eid, originating_update_id
Buffers: shared hit=6
-> Index Only Scan using updates_covering_idx3 on public.updates (cost=0.69..762.10 rows=8383 width=39) (actual time=0.044..0.044 rows=1 loops=1)
Output: time, update_id, amount, eid, originating_update_id
Index Cond: ((updates.account_id = '1'::text) AND (updates.attr1 = ''::text) AND (updates.attr2 = 3) AND (updates.attr3 = 'ABC'::text) AND (updates.time <= '2017-12-30 15:57:35.267405'::timestamp without time zone))
Filter: ((updates.time < '2017-12-30 15:57:35.267405'::timestamp without time zone) OR ((updates.time = '2017-12-30 15:57:35.267405'::timestamp without time zone) AND (updates.originating_update_id <= 10000000)))
Heap Fetches: 0
Buffers: shared hit=6
Planning time: 0.201 ms
Execution time: 0.073 ms
(11 rows)
Best Answer
This kind of makes sense if your data are distributed in a certain way. Without the
update_id < 40581
condition, the very first entry in the index that matches account_id = '1' AND attr1 = '' AND attr2 = 3 AND attr3 = 'ABC'
satisfies the LIMIT
clause, because the entries are ordered appropriately. With the condition in place, the scan has to keep going until it finds a row with a matching update_id, which may be way down the list.
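One way to check this hypothesis (a sketch; the literal key values are taken from the simplified query above) is to count how many index entries sit ahead of the first qualifying row in the index's (time DESC, originating_update_id DESC, update_id DESC) order:

```sql
-- If this count is large, the index-only scan must step over that many
-- entries (all with update_id >= 40581) before LIMIT 1 can be satisfied,
-- which would account for the extra shared-buffer hits in the slow plan.
SELECT count(*)
FROM updates
WHERE account_id = '1'
  AND attr1 = ''
  AND attr2 = 3
  AND attr3 = 'ABC'
  AND time <= '2017-12-30 15:57:35.267405'
  AND update_id >= 40581;
```

Note the asymmetry: update_id appears in the Index Cond in both plans, but because it is not a leading column of updates_covering_idx3, the inequality cannot bound where the scan starts; it only lets the scan reject entries as it walks them. A clause on originating_update_id < 100000000 matches essentially every entry, so the first row read still wins, which is why that substitution does not slow the query down.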