Postgresql – Postgres query planner issues

execution-planperformancepostgresqlquery-performance

There are two cases from different DBs and my main questions are:

  • why it is happening?
  • what can I do to stop it from happening?

OK, case #1

EXPLAIN (ANALYZE, BUFFERS)
 SELECT "notes"."id" AS t0_r0, ... "people"."id" AS t1_r0, ... 
   FROM "notes" 
  INNER JOIN "people" ON
        "people"."id" = "notes"."person_id" AND
        "people"."deleted_at" IS NULL 
  WHERE "notes"."deleted_at" IS NULL AND
        "notes"."company_id" = 2778 AND
        "people"."company_id" = 2778 AND
        "notes"."person_id" = 7691765 AND
        "people"."company_id" = 2778 AND
        "people"."salesperson_id" = 28166
  ORDER BY "notes"."created_at" DESC, "notes"."id" DESC
  LIMIT 2 OFFSET 0;

Plan

Limit  (cost=209.18..209.19 rows=1 width=1196) (actual time=45.609..45.610 rows=2 loops=1)
   Buffers: shared hit=71590
   ->  Sort  (cost=209.18..209.19 rows=1 width=1196) (actual time=45.608..45.608 rows=2 loops=1)
         Sort Key: notes.created_at, notes.id
         Sort Method: top-N heapsort  Memory: 26kB
         Buffers: shared hit=71590
         ->  Nested Loop  (cost=1.00..209.17 rows=1 width=1196) (actual time=3.143..45.533 rows=22 loops=1)
               Buffers: shared hit=71590
               ->  Index Scan using index_notes_on_person_id on notes  (cost=0.57..200.71 rows=1 width=219) (actual time=0.019..0.060 rows=22 loops=1)
                     Index Cond: (person_id = 7691765)
                     Filter: ((deleted_at IS NULL) AND (company_id = 2778))
                     Buffers: shared hit=24
               ->  Index Scan using index_people_on_company_id_and_salesperson_id on people  (cost=0.43..8.46 rows=1 width=977) (actual time=1.546..2.064 rows=1 loops=22)
                     Index Cond: ((company_id = 2778) AND (salesperson_id = 28166))
                     Filter: ((deleted_at IS NULL) AND (id = 7691765))
                     Rows Removed by Filter: 3331
                     Buffers: shared hit=71566

So while the planner knows the ID (that's a PK) of the record in the people table, it still wants to use index_people_on_company_id_and_salesperson_id index, instead of using PK.

If I remove the AND "people"."salesperson_id" = 28166 condition from the query it will happily use PK, and the query time drops to sub ms values:

Limit  (cost=209.63..209.63 rows=1 width=1195) (actual time=0.271..0.272 rows=2 loops=1)
 Buffers: shared hit=121 read=1
 ->  Sort  (cost=209.63..209.63 rows=1 width=1195) (actual time=0.270..0.271 rows=2 loops=1)
       Sort Key: notes.created_at, notes.id
       Sort Method: top-N heapsort  Memory: 26kB
       Buffers: shared hit=121 read=1
       ->  Nested Loop  (cost=1.00..209.62 rows=1 width=1195) (actual time=0.097..0.227 rows=24 loops=1)
             Buffers: shared hit=121 read=1
             ->  Index Scan using index_notes_on_person_id on notes  (cost=0.57..201.15 rows=1 width=219) (actual time=0.017..0.052 rows=24 loops=1)
                   Index Cond: (person_id = 7691765)
                   Filter: ((deleted_at IS NULL) AND (company_id = 2778))
                   Buffers: shared hit=26
             ->  Index Scan using index_people_on_company_id_and_id on people  (cost=0.43..8.46 rows=1 width=976) (actual time=0.006..0.006 rows=1 loops=24)
                   Index Cond: ((company_id = 2778) AND (id = 7691765))
                   Buffers: shared hit=95 read=1
Planning time: 0.503 ms
Execution time: 0.361 ms

Case #2

Query:

EXPLAIN (ANALYZE,BUFFERS)
 SELECT i.id, 'Institution' AS parent_type, COUNT(sm.id), MAX(i.id) AS institution_id, MAX(sm.id) AS last_message_id
   FROM public.institutions i
   LEFT JOIN public.sync_messages sm ON
        sm.parent_id = i.id AND
        sm.parent_type = 'Institution' AND
        sm.process_status = 0
  WHERE i.is_sync_live = TRUE AND                                                                                        
        (
           i.config->>'sync_source' != 'Location' OR
           i.config->>'sync_source' IS NULL
        )
  GROUP BY i.id;

Plan

HashAggregate  (cost=1245.84..1246.64 rows=80 width=8) (actual time=17.784..17.795 rows=22 loops=1)
   Group Key: i.id
   Buffers: shared hit=6202 read=3
   ->  Hash Right Join  (cost=59.77..1232.99 rows=1285 width=8) (actual time=0.594..17.714 rows=57 loops=1)
         Hash Cond: (sm.parent_id = i.id)
         Buffers: shared hit=6202 read=3
         ->  Index Scan using index_sync_messages_on_process_status on sync_messages sm  (cost=0.43..1149.29 rows=3069 width=8) (actual time=0.035..13.806 rows=7102 loops=1)
               Index Cond: (process_status = 0)
               Filter: ((parent_type)::text = 'Institution'::text)
               Rows Removed by Filter: 180
               Buffers: shared hit=6147 read=3
         ->  Hash  (cost=58.34..58.34 rows=80 width=4) (actual time=0.535..0.535 rows=22 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 9kB
               Buffers: shared hit=55
               ->  Seq Scan on institutions i  (cost=0.00..58.34 rows=80 width=4) (actual time=0.279..0.517 rows=22 loops=1)
                     Filter: (is_sync_live AND (((config ->> 'sync_source'::text) <> 'Location'::text) OR ((config ->> 'sync_source'::text) IS NULL)))
                     Rows Removed by Filter: 169
                     Buffers: shared hit=55
 Planning time: 0.439 ms
 Execution time: 17.873 ms

That's the second run, first run gives something like 5s – 9s performance (ugh).

So this index it uses Index Scan using index_sync_messages_on_process_status on sync_messages – it is not perfect, there are only 3 values in the table for that column:

 process_status │  count  
────────────────┼─────────
              0 │    7207
              1 │ 1378395
              2 │      29

and I've created a better one, specifically for that query

CREATE INDEX CONCURRENTLY index_sync_messages_on_parent_type_and_parent_id_and_id_unprocessed_institutions 
    on sync_messages (parent_id) 
where parent_type = 'Institution' and process_status = 0;

Looks like a perfect match for that query, but the planner keeps ignoring this one. Why?

Best Answer

In case 1, it believes that both indexes are so selective that only one row will be found by either one. So they are tied. Ties are broken arbitrarily, by which index is found first (or maybe last?) in the catalog. You can't really see explicit evidence of this in your queries, because an EXPLAIN plan only shows the expected number of rows to be returned by the scan, not the number of rows expected to be found and then filtered out. The bad index plan actually had to filter 3331 rows, but the planner did not expect it to filter out any (as evidenced implicitly by the very low cost assigned). If you want to verify that, you could run:

explain analyze select * from people where company_id = 2778 AND
  salesperson_id = 28166

And you will probably see it expects one row, but gets 3332 (or maybe 3332/22)

Probably the reason that the planner is wrong about this is that there is a correlation between the company_id and salesperson_id columns. That is, either salespeople generally work only (or dominantly) with one company, or this particular salesperson does. It thinks those two selection criteria are independent, when they are not. If it is the first case where each salesperson generally works with just one company, you might be be able to fix the problem by creating multi-column statistics (except they don't exist until version 10):

create statistics alsdfjsf on company_id, salesperson_id from people

You could argue that PostgreSQL should know that the PK index can only possibly return 1 row, while the other one is expected to return 1 but has the possibility of returning more and so is a riskier plan to follow. But that is just a nuance that the planner does not implement, although it has been discussed extensively.