I created such table (similar to example from http://use-the-index-luke.com/sql/example-schema/postgresql/performance-testing-scalability )
CREATE TABLE scale_data (
section NUMERIC NOT NULL,
id1 NUMERIC NOT NULL, -- unique values simulating ID or Timestamp
id2 NUMERIC NOT NULL -- a kind of Type
);
Populate it with:
INSERT INTO scale_data
SELECT sections.sections, sections.sections*10000 + gen.gen
, CEIL(RANDOM()*100)
FROM GENERATE_SERIES(1, 300) sections,
GENERATE_SERIES(1, 90000) gen
WHERE gen <= sections * 300;
It generated 13545000 records.
Composite index on it:
CREATE INDEX id1_id2_idx
ON public.scale_data
USING btree
(id1, id2);
And select#1:
select id2 from scale_data
where id2 in (50)
order by id1 desc
limit 500
Explain analyze:
"Limit (cost=0.56..1177.67 rows=500 width=11) (actual time=0.046..5.124 rows=500 loops=1)"
" -> Index Only Scan Backward using id1_id2_idx on scale_data (cost=0.56..311588.74 rows=132353 width=11) (actual time=0.045..5.060 rows=500 loops=1)"
" Index Cond: (id2 = '50'::numeric)"
" Heap Fetches: 0"
"Planning time: 0.103 ms"
"Execution time: 5.177 ms"
Select#2 –more values in IN – plan has changed
select id2 from scale_data
where id2 in (50, 52)
order by id1 desc
limit 500
Explain analyze#2:
"Limit (cost=0.56..857.20 rows=500 width=11) (actual time=0.061..8.703 rows=500 loops=1)"
" -> Index Only Scan Backward using id1_id2_idx on scale_data (cost=0.56..445780.74 rows=260190 width=11) (actual time=0.059..8.648 rows=500 loops=1)"
" Filter: (id2 = ANY ('{50,52}'::numeric[]))"
" Rows Removed by Filter: 25030"
" Heap Fetches: 0"
"Planning time: 0.153 ms"
"Execution time: 8.771 ms"
Why plan differs?
Why in #1 it does show like Index condition, but in #2 Filter and number of index scanned cells.
Doesn't sql#1 traverse index in the same way like explain for sql#2 shows?
On real/production DB #2 works much slower, even if search by 2 keys separately is fast
PG 9.5
Best Answer
I wouldn't let this bug you.
FILTER
in that context, I believe just means more than one conditional statement on the index (which is howIN
and array operations get translated to, afaik). In either one, they're all executed underIndex Only Scan Backward
. It works the same way withOR