I have the following:
- a complex query, 'match', that returns about 200 rows (ids)
- a huge table, 'usernames', indexed on id that maps them to usernames
When I try to left join the query (1) with the table (2) the planner decides to do a seq scan on the (2) table. This ends up taking about 90ms on my machine.
i.e:
-> CTE Scan on match l (cost=0.00..3349.54 rows=167477 width=4) (actual time=3.458..5.188 rows=200 loops=1)
-> Hash (cost=1638.00..1638.00 rows=100000 width=18) (actual time=66.537..66.537 rows=100000 loops=1)
Buckets: 8192 Batches: 2 Memory Usage: 2542kB
-> Seq Scan on usernames r (cost=0.00..1638.00 rows=100000 width=18) (actual time=0.019..27.595 rows=100000 loops=1)
Planning time: 16.261 ms
Execution time: 77.425 ms
If I add ('limit 200') in the query (1), the query time drops down to 15ms
-> CTE Scan on match l (cost=0.00..4.00 rows=200 width=4) (actual time=3.623..6.163 rows=200 loops=1)
-> Index Scan using usernames_userid_idx on usernames r (cost=0.29..6.71 rows=1 width=18) (actual time=0.017..0.017 rows=1 loops=200)
Index Cond: (l.id = userid)
Planning time: 4.682 ms
Execution time: 10.093 ms
I understand that this is due to the planned not estimating correctly the number of returned rows of my 'match' query.
Is there anyway to hint the planner to do an Index Scan on the join, instead of the Seq Scan?
The complete query is:
EXPLAIN ANALYSE
WITH match AS
(SELECT
dist_match.match_userid AS id
FROM
(SELECT
me.into_game AS into_game, r5.miles, r5.useridy AS match_userid
FROM
(SELECT l.id AS userid,
category,
r2.game,
r3.game AS into_game,
distance AS into_distance
FROM userids l
LEFT JOIN category r1 ON l.id = r1.userid
LEFT JOIN games r2 ON l.id = r2.userid
LEFT JOIN into_games r3 ON l.id = r3.userid
LEFT JOIN into_distances r4 ON l.id = r4.userid
WHERE l.id = 68931) AS me
LEFT JOIN user_distances r5 ON me.userid = r5.useridx
WHERE r5.miles < me.into_distance) AS dist_match
LEFT JOIN games r6 ON (r6.userid = dist_match.match_userid
AND r6.game = dist_match.into_game) ***LIMIT 200***)
SELECT l.id,name FROM match l LEFT JOIN usernames r ON l.id = r.userid;
Best Answer
You can influence the planner with
to see if it helps.
See more at http://www.postgresql.org/docs/9.5/static/runtime-config-query.html