Postgresql – How to influence the planner on Postgresql

performancepostgresqlpostgresql-performance

I have the following:

  1. a complex query, 'match', that returns about 200 rows (ids)
  2. a huge table, 'usernames', indexed on id that maps them to usernames

When I try to left join the query (1) with the table (2) the planner decides to do a seq scan on the (2) table. This ends up taking about 90ms on my machine.

i.e:

   ->  CTE Scan on match l  (cost=0.00..3349.54 rows=167477 width=4) (actual time=3.458..5.188 rows=200 loops=1)
   ->  Hash  (cost=1638.00..1638.00 rows=100000 width=18) (actual time=66.537..66.537 rows=100000 loops=1)
         Buckets: 8192  Batches: 2  Memory Usage: 2542kB
         ->  Seq Scan on usernames r  (cost=0.00..1638.00 rows=100000 width=18) (actual time=0.019..27.595 rows=100000 loops=1)
 Planning time: 16.261 ms
 Execution time: 77.425 ms

If I add ('limit 200') in the query (1), the query time drops down to 15ms

   ->  CTE Scan on match l  (cost=0.00..4.00 rows=200 width=4) (actual time=3.623..6.163 rows=200 loops=1)
   ->  Index Scan using usernames_userid_idx on usernames r  (cost=0.29..6.71 rows=1 width=18) (actual time=0.017..0.017 rows=1 loops=200)
         Index Cond: (l.id = userid)
 Planning time: 4.682 ms
 Execution time: 10.093 ms

I understand that this is due to the planned not estimating correctly the number of returned rows of my 'match' query.

Is there anyway to hint the planner to do an Index Scan on the join, instead of the Seq Scan?

The complete query is:

EXPLAIN ANALYSE 

WITH match AS

 (SELECT
   dist_match.match_userid AS id
     FROM

      (SELECT
        me.into_game AS into_game, r5.miles, r5.useridy AS match_userid
        FROM 
          (SELECT l.id AS userid,
           category,
           r2.game,
           r3.game AS into_game,
           distance AS into_distance
           FROM userids l 
           LEFT JOIN category r1 ON l.id = r1.userid
           LEFT JOIN games r2 ON l.id = r2.userid
           LEFT JOIN into_games r3 ON l.id = r3.userid
           LEFT JOIN into_distances r4 ON l.id = r4.userid 
           WHERE l.id = 68931) AS me
         LEFT JOIN user_distances r5 ON me.userid = r5.useridx
         WHERE r5.miles < me.into_distance) AS dist_match
       LEFT JOIN games r6 ON (r6.userid = dist_match.match_userid
       AND r6.game = dist_match.into_game) ***LIMIT 200***)

SELECT l.id,name FROM match l LEFT JOIN usernames r ON l.id = r.userid;

Best Answer

You can influence the planner with

set local enable_seqscan = off;

to see if it helps.

See more at http://www.postgresql.org/docs/9.5/static/runtime-config-query.html