I have the following query:
SELECT * FROM plane_tracks
WHERE created_at >= '2016-12-29 08:00:00' AND updated_at <= '2016-12-30 00:00:00';
I've created the following indexes:
create index plane_tracks_on_created_at on plane_tracks(created_at);
create index plane_tracks_on_updated_at on plane_tracks(updated_at);
create index plane_tracks_on_created_at_updated_at on plane_tracks(created_at,updated_at);
create index plane_tracks_on_created_at_updated_at_desc on plane_tracks(created_at,updated_at DESC);
Yet when I run the query, it scans all elements in plane_tracks.
# EXPLAIN ANALYZE SELECT * FROM plane_tracks WHERE created_at >= '2016-12-29 08:00:00' AND updated_at <= '2016-12-30 00:00:00';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on plane_tracks (cost=0.00..582.78 rows=5425 width=471) (actual time=1.507..3.435 rows=1978 loops=1)
Filter: ((created_at >= '2016-12-29 08:00:00'::timestamp without time zone) AND (updated_at <= '2016-12-30 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 15941
Planning time: 0.366 ms
Execution time: 3.618 ms
(5 rows)
When I drop the updated_at part, like this:
SELECT * FROM plane_tracks WHERE created_at >= '2016-12-29 08:00:00';
Then I get this:
# EXPLAIN ANALYZE SELECT * FROM plane_tracks WHERE created_at >= '2016-12-29 08:00:00';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Seq Scan on plane_tracks (cost=0.00..537.99 rows=8595 width=471) (actual time=0.022..3.261 rows=8593 loops=1)
Filter: (created_at >= '2016-12-29 08:00:00'::timestamp without time zone)
Rows Removed by Filter: 9326
Planning time: 0.093 ms
Execution time: 3.901 ms
(5 rows)
At least this last part makes sense.
But I don't understand why
- the first query isn't at least as efficient as the second one.
- why first query isn't much better than the second one, given that there are also indexes on updated_at.
- what index to create to make it really efficient
Tom
Best Answer
A few things to try.
Drop these (unless you have other queries) and see if Pg takes the index. They're not needed and only serve to confuse things and make UPDATE/INSERT slower slower.
set enable_seqscan off
in psql and running the same query. Paste back the response. This will tell us whether or not Pg can even do it without aseqscan
. If PostgreSQL is still using a seq scan, it's a last resort, and then for some reason it feels it can not use compound indexplane_tracks(created_at,updated_at DESC)
, and we have to look at rewriting that to another compound index.ANALYZE plane_tracks
. And thenEXPLAIN ANALYZE
the query again to see if a new analysis has fixed this. If not,set default_statistics_target 1000;
and rerunningANALYZE plane_tracks
. If that works, then considerALTER TABLE plane_tracks SET STATISTICS 1000
(or something higher than whatdefault_statistics_target
is currently set at.Proof that it's possible with test data
As proof that it's possible with one index
Query plan
Default analytics come close guessing 3584, when it returns 3580. I almost never tinker with
default_statistics
or table statistics.