I'm using Postgres 9.3.4 and I have 4 queries that have very similar inputs but have vastly different response times:
Query #1
EXPLAIN ANALYZE SELECT posts.* FROM posts
WHERE posts.source_id IN (19082, 19075, 20705, 18328, 19110, 24965, 18329, 27600, 17804, 20717, 27598, 27599)
AND posts.deleted_at IS NULL
ORDER BY external_created_at desc
LIMIT 100 OFFSET 0;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..585.44 rows=100 width=1041) (actual time=326092.852..507360.199 rows=100 loops=1)
-> Index Scan using index_posts_on_external_created_at on posts (cost=0.43..14871916.35 rows=2542166 width=1041) (actual time=326092.301..507359.524 rows=100 loops=1)
Filter: (source_id = ANY ('{19082,19075,20705,18328,19110,24965,18329,27600,17804,20717,27598,27599}'::integer[]))
Rows Removed by Filter: 6913925
Total runtime: 507361.944 ms
Query #2
EXPLAIN ANALYZE SELECT posts.* FROM posts
WHERE posts.source_id IN (5202, 5203, 661, 659, 662, 627)
AND posts.deleted_at IS NULL
ORDER BY external_created_at desc
LIMIT 100 OFFSET 0;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=31239.64..31239.89 rows=100 width=1041) (actual time=2.004..2.038 rows=100 loops=1)
-> Sort (cost=31239.64..31261.26 rows=8648 width=1041) (actual time=2.003..2.017 rows=100 loops=1)
Sort Key: external_created_at
Sort Method: top-N heapsort Memory: 80kB
-> Index Scan using index_posts_on_source_id on posts (cost=0.44..30909.12 rows=8648 width=1041) (actual time=0.024..1.063 rows=944 loops=1)
Index Cond: (source_id = ANY ('{5202,5203,661,659,662,627}'::integer[]))
Filter: (deleted_at IS NULL)
Rows Removed by Filter: 109
Total runtime: 2.125 ms
Query #3
EXPLAIN ANALYZE SELECT posts.* FROM posts
WHERE posts.source_id IN (14790, 14787, 32928, 14796, 14791, 15503, 14789, 14772, 15506, 14794, 15543, 31615, 15507, 15508, 14800)
AND posts.deleted_at IS NULL
ORDER BY external_created_at desc
LIMIT 100 OFFSET 0;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..821.25 rows=100 width=1041) (actual time=19.224..55.599 rows=100 loops=1)
-> Index Scan using index_posts_on_external_created_at on posts (cost=0.43..14930351.58 rows=1818959 width=1041) (actual time=19.213..55.529 rows=100 loops=1)
Filter: (source_id = ANY ('{14790,14787,32928,14796,14791,15503,14789,14772,15506,14794,15543,31615,15507,15508,14800}'::integer[]))
Rows Removed by Filter: 414
Total runtime: 55.683 ms
Query #4
EXPLAIN ANALYZE SELECT posts.* FROM posts
WHERE posts.source_id IN (18766, 18130, 18128, 18129, 19705, 28252, 18264, 18126, 18767, 27603, 28657, 28654, 28655, 19706, 18330)
AND posts.deleted_at IS NULL
ORDER BY external_created_at desc
LIMIT 100 OFFSET 0;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..69055.29 rows=100 width=1041) (actual time=26.094..320.626 rows=100 loops=1)
-> Index Scan using index_posts_on_external_created_at on posts (cost=0.43..14930351.58 rows=21621 width=1041) (actual time=26.093..320.538 rows=100 loops=1)
Filter: (source_id = ANY ('{18766,18130,18128,18129,19705,28252,18264,18126,18767,27603,28657,28654,28655,19706,18330}'::integer[]))
Rows Removed by Filter: 6156
Total runtime: 320.778 ms
All 4 are the same, apart from looking at posts with different source_id
s.
Three of the four end up using the following index:
CREATE INDEX index_posts_on_external_created_at ON posts USING btree (external_created_at DESC)
WHERE (deleted_at IS NULL);
And the #2 uses this index:
CREATE INDEX index_posts_on_source_id ON posts USING btree (source_id);
What's interesting to me, is that of the 3 that use the index_posts_on_external_created_at
index, two are quite fast, while the other (#1) is insanely slow.
Query #2 has way less posts than the other 3 do, so that might explain why it uses the index_posts_on_source_id
index instead. However, if I get rid of the index_posts_on_external_created_at
index, the other 3 queries are extremely slow when forced to use the index_posts_on_source_id
index.
Here's my definition of the posts table:
CREATE TABLE posts (
id integer NOT NULL,
source_id integer,
message text,
image text,
external_id text,
created_at timestamp without time zone,
updated_at timestamp without time zone,
external text,
like_count integer DEFAULT 0 NOT NULL,
comment_count integer DEFAULT 0 NOT NULL,
external_created_at timestamp without time zone,
deleted_at timestamp without time zone,
poster_name character varying(255),
poster_image text,
poster_url character varying(255),
poster_id text,
position integer,
location character varying(255),
description text,
video text,
rejected_at timestamp without time zone,
deleted_by character varying(255),
height integer,
width integer
);
I've tried using CLUSTER posts USING index_posts_on_external_created_at
Which is essentially an index that orders by external_created_at and this seems to be the only effective method I've found. However, I am unable to use this on production as it causes a global lock for several hours while it runs. I'm on heroku, so I can't install pg_repack or anything like that.
Why would the #1 query be so slow, and others be really quick? What can I do to mitigate this?
EDIT: Here are my queries without the LIMIT
and ORDER
Query #1
EXPLAIN ANALYZE SELECT posts.* FROM posts
WHERE posts.source_id IN (19082, 19075, 20705, 18328, 19110, 24965, 18329, 27600, 17804, 20717, 27598, 27599)
AND posts.deleted_at IS NULL
ORDER BY external_created_at desc;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=7455044.81..7461163.56 rows=2447503 width=1089) (actual time=94903.143..95110.898 rows=238975 loops=1)
Sort Key: external_created_at
Sort Method: external merge Disk: 81440kB
-> Bitmap Heap Scan on posts (cost=60531.78..1339479.50 rows=2447503 width=1089) (actual time=880.150..90988.460 rows=238975 loops=1)
Recheck Cond: (source_id = ANY ('{19082,19075,20705,18328,19110,24965,18329,27600,17804,20717,27598,27599}'::integer[]))
Rows Removed by Index Recheck: 5484857
Filter: (deleted_at IS NULL)
Rows Removed by Filter: 3108465
-> Bitmap Index Scan on index_posts_on_source_id (cost=0.00..59919.90 rows=3267549 width=0) (actual time=877.904..877.904 rows=3347440 loops=1)
Index Cond: (source_id = ANY ('{19082,19075,20705,18328,19110,24965,18329,27600,17804,20717,27598,27599}'::integer[]))
Total runtime: 95534.724 ms
Query #2
EXPLAIN ANALYZE SELECT posts.* FROM posts
WHERE posts.source_id IN (5202, 5203, 661, 659, 662, 627)
AND posts.deleted_at IS NULL
ORDER BY external_created_at desc;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=36913.72..36935.85 rows=8852 width=1089) (actual time=212.450..212.549 rows=944 loops=1)
Sort Key: external_created_at
Sort Method: quicksort Memory: 557kB
-> Index Scan using index_posts_on_source_id on posts (cost=0.44..32094.90 rows=8852 width=1089) (actual time=1.732..209.590 rows=944 loops=1)
Index Cond: (source_id = ANY ('{5202,5203,661,659,662,627}'::integer[]))
Filter: (deleted_at IS NULL)
Rows Removed by Filter: 109
Total runtime: 214.507 ms
Query #3
EXPLAIN ANALYZE SELECT posts.* FROM posts
WHERE posts.source_id IN (14790, 14787, 32928, 14796, 14791, 15503, 14789, 14772, 15506, 14794, 15543, 31615, 15507, 15508, 14800)
AND posts.deleted_at IS NULL
ORDER BY external_created_at desc;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=5245032.87..5249894.14 rows=1944508 width=1089) (actual time=131032.952..134342.372 rows=1674072 loops=1)
Sort Key: external_created_at
Sort Method: external merge Disk: 854864kB
-> Bitmap Heap Scan on posts (cost=48110.86..1320005.55 rows=1944508 width=1089) (actual time=605.648..91351.334 rows=1674072 loops=1)
Recheck Cond: (source_id = ANY ('{14790,14787,32928,14796,14791,15503,14789,14772,15506,14794,15543,31615,15507,15508,14800}'::integer[]))
Rows Removed by Index Recheck: 5304550
Filter: (deleted_at IS NULL)
Rows Removed by Filter: 879414
-> Bitmap Index Scan on index_posts_on_source_id (cost=0.00..47624.73 rows=2596024 width=0) (actual time=602.744..602.744 rows=2553486 loops=1)
Index Cond: (source_id = ANY ('{14790,14787,32928,14796,14791,15503,14789,14772,15506,14794,15543,31615,15507,15508,14800}'::integer[]))
Total runtime: 136176.868 ms
Query #4
EXPLAIN ANALYZE SELECT posts.* FROM posts
WHERE posts.source_id IN (18766, 18130, 18128, 18129, 19705, 28252, 18264, 18126, 18767, 27603, 28657, 28654, 28655, 19706, 18330)
AND posts.deleted_at IS NULL
ORDER BY external_created_at desc;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=102648.92..102704.24 rows=22129 width=1089) (actual time=15225.250..15256.931 rows=51408 loops=1)
Sort Key: external_created_at
Sort Method: external merge Disk: 35456kB
-> Index Scan using index_posts_on_source_id on posts (cost=0.45..79869.91 rows=22129 width=1089) (actual time=3.975..14803.320 rows=51408 loops=1)
Index Cond: (source_id = ANY ('{18766,18130,18128,18129,19705,28252,18264,18126,18767,27603,28657,28654,28655,19706,18330}'::integer[]))
Filter: (deleted_at IS NULL)
Rows Removed by Filter: 54
Total runtime: 15397.453 ms
Postgres memory settings:
name, setting, unit
'default_statistics_target','100',''
'effective_cache_size','16384','8kB'
'maintenance_work_mem','16384','kB'
'max_connections','100',''
'random_page_cost','4',NULL
'seq_page_cost','1',NULL
'shared_buffers','16384','8kB'
'work_mem','1024','kB'
Database stats:
Total Posts: 20,997,027
Posts where deleted_at is null: 15,665,487
Distinct source_id's: 22,245
Max number of rows per single source_id: 1,543,950
Min number of rows per single source_id: 1
Most source_ids in a single query: 21
Distinct external_created_at: 11,146,151
Best Answer
General advice
All the general advice for performance optimization applies. Default settings are very conservative and some of these resource settings are way to low for tables with millions of rows (
work_mem
in particular). You need to configure your RDBMS to make use of available RAM wisely. The Postgres Wiki is a good starting point. This is beyond the scope of a single question here.However, the query I suggest below only needs very moderate resource settings.
Also increase the statistics target for
source_id
to have more detailed statistics on the crucial column:Then:
ANALYZE posts;
More:
You could optimize storage some more (for minor gains):
Query
The query itself is hard to optimize. Refer to @ypercube's related question for advanced performance optimization:
There is a simple method if ...
source_id
per query is reasonably smallLIMIT
is also reasonably small.... which is true for your case according to your added details.
The only index you need for the query below:
Example based on your query #1:
This is emulating a loose index scan, similar to what's discussed here in great detail:
If n is the number of source_id's (and luckily never > 21), we make Postgres fetch the top 100 rows (according to
external_created_at DESC
) for eachsource_id
from the index, which is very fast in itself, but max. (n-1) * 100 rows are surplus. Given your value frequencies:(You didn't clarify whether all of these numbers include "deleted" rows, but only ~ 25% are "deleted".)
... I'd expect some of the
source_id
's to have less than 100 rows to begin with. So we only have to sort 2100 rows in the worst case (typically much fewer) to keep the top 100. That shouldn't perform so badly - once you have configured Postgres with decent resource settings.If you have a source table holding all distinct
source_id
, it might make sense to use it and eliminate non-existentsource_id
early:A maximum of 21
IN
values is ok for this form, but consider this related question:You could optimize some more if you know the minimum
external_created_at
or the maximum number of rows from a singlesource_id
in the result ...