PostgreSQL – Postgres 10 picks very slow query plan when faster plans seem available

execution-plan, postgresql

We are using Postgres 10.

We are seeing a situation where, for the same query, with one value it picks a super inefficient query plan, but with another value it picks a good one.

SELECT
   "sfe_procedure"."*"
FROM "sfe_procedure" INNER JOIN "sfe_claim"
   ON ("sfe_procedure"."claim_id" = "sfe_claim"."id")
WHERE (
   "sfe_procedure"."patient_id" = 100
   AND "sfe_claim"."from_date" >= '2011-01-01'::date
   AND "sfe_claim"."from_date" <= '2020-02-13'::date
   AND "sfe_procedure"."code" IN (
       SELECT U0."code"
       FROM "medical_codes_hediscode" U0
       WHERE U0."value_set_name" = 'Colonoscopy'
   )
)
ORDER BY "sfe_claim"."from_date" DESC LIMIT 1;

It picks this query plan:

 Limit  (cost=1001.30..5784.38 rows=1 width=46)
   ->  Nested Loop Semi Join  (cost=1001.30..2459500.48 rows=514 width=46)
         Join Filter: ((sfe_procedure.code)::text = (u0.code)::text)
         ->  Gather Merge  (cost=1000.89..2455885.11 rows=4369 width=46)
               Workers Planned: 2
               ->  Nested Loop  (cost=0.86..2454380.80 rows=1820 width=46)
                     ->  Parallel Index Scan Backward using sfe_claim_from_da_3d3cc6_idx on sfe_claim  (cost=0.43..771392.17 rows=1427744 width=8)
                           Index Cond: ((from_date >= '2011-01-01'::date) AND (from_date <= '2020-02-13'::date))
                     ->  Index Scan using sfe_procedure_claim_id_f0f8a72b on sfe_procedure  (cost=0.43..1.17 rows=1 width=42)
                           Index Cond: (claim_id = sfe_claim.id)
                           Filter: (patient_id = 100)
         ->  Materialize  (cost=0.42..11.08 rows=55 width=7)
               ->  Index Scan using medical_cod_value_s_e3a75d_idx on medical_codes_hediscode u0  (cost=0.42..10.81 rows=55 width=7)
                     Index Cond: ((value_set_name)::text = 'Colonoscopy'::text)
(14 rows)

Notice especially this row:

Parallel Index Scan Backward using sfe_claim_from_da_3d3cc6_idx 
  on sfe_claim  (cost=0.43..771392.17 rows=1427744 width=8)

Which seems to suggest that its best bet is to load 1.4 mil rows, and take 771 seconds? Am I reading this wrong?

On a different value for patient_id, the query plan is sane and the query is blazing fast:

Limit  (cost=4173.67..4173.67 rows=1 width=48)
  ->  Sort  (cost=4173.67..4174.12 rows=180 width=48)
        Sort Key: sfe_claim.from_date DESC
        ->  Nested Loop  (cost=18.73..4172.77 rows=180 width=48)
              ->  Hash Semi Join  (cost=18.29..2722.51 rows=180 width=44)
                    Hash Cond: ((sfe_procedure.code)::text = (u0.code)::text)
                    ->  Index Scan using sfe_procedure_patient_id_4bcdc9ad on sfe_procedure  (cost=0.43..2677.30 rows=3829 width=44)
                          Index Cond: (patient_id = 200)
                    ->  Hash  (cost=16.82..16.82 rows=83 width=7)
                          ->  Index Scan using medical_cod_value_s_e3a75d_idx on medical_codes_hediscode u0  (cost=0.42..16.82 rows=83 width=7)
                                Index Cond: ((value_set_name)::text = 'Colonoscopy'::text)
              ->  Index Scan using sfe_claim_pkey on sfe_claim  (cost=0.43..8.06 rows=1 width=8)
                    Index Cond: (id = sfe_procedure.claim_id)
                    Filter: ((from_date >= '2011-01-01'::date) AND (from_date <= '2020-02-13'::date))

The DB has had ANALYZE run on it recently, so it doesn't seem like the DB is confused about the amount of work it's signing up for… but it still picks that plan.
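
For reference, when the tables were last analyzed can be checked in pg_stat_user_tables:

SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname IN ('sfe_claim', 'sfe_procedure');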

Questions

  • Is there a way to understand why other query plans were not picked?
  • Am I missing something obvious – a multicolumn index doesn't seem to work – that could help Postgres see a better plan faster?
  • Do I need to rewrite the query?

Thank you for any help or pointers!

Best Answer

Which seems to suggest that its best bet is to load 1.4 mil rows, and take 771 seconds? Am I reading this wrong?

You are reading it wrong in two ways. It doesn't think it will read 1.4 million rows. It thinks it will start an index scan which, if run to completion, would read 1.4 million rows, but that, thanks to the LIMIT 1, it will get to stop very early on. Apparently it is wrong about how early it can stop, but that is what it thinks.

Also, cost estimates are in arbitrary units, not in milliseconds, so it doesn't think it will take 771 seconds. And the cost at that level is again assuming the index scan runs to completion. The cost of the top LIMIT node, 5784.38 arbitrary units, has been scaled back for the expected benefit of stopping early, while the costs of the child nodes have not been.
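
You can reproduce that scaling from the numbers in the plan: the Limit's total cost is the semi join's startup cost plus roughly 1/514th of its remaining run cost, since the planner expects the first of the estimated 514 rows to satisfy LIMIT 1. As a back-of-the-envelope check:

-- Startup cost of the Nested Loop Semi Join, plus one row's share
-- of its remaining run cost (numbers taken from the plan above):
SELECT 1001.30 + (2459500.48 - 1001.30) / 514;
-- ≈ 5784.37, matching the 5784.38 shown on the Limit node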

Presumably patient 100 has not had any claims in a long time (like maybe they died or moved away a couple of decades ago), while PostgreSQL thinks their claims will be evenly interspersed through time.
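
A quick way to test that guess is to look up the patient's most recent claim date directly, e.g.:

-- If this comes back years before 2020-02-13, the backward scan of
-- from_date has to wade through a lot of claims before hitting patient 100.
SELECT max(c.from_date)
FROM sfe_claim c
JOIN sfe_procedure p ON p.claim_id = c.id
WHERE p.patient_id = 100;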

It is hard to say more than that without seeing the output of EXPLAIN (ANALYZE) rather than just EXPLAIN.
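
That is, the same statement prefixed like this (note that EXPLAIN (ANALYZE) actually executes the query, so the slow case will take as long as the query itself; BUFFERS is optional but adds useful detail):

EXPLAIN (ANALYZE, BUFFERS)
SELECT
   "sfe_procedure".*
FROM "sfe_procedure" INNER JOIN "sfe_claim"
   ON ("sfe_procedure"."claim_id" = "sfe_claim"."id")
WHERE (
   "sfe_procedure"."patient_id" = 100
   AND "sfe_claim"."from_date" >= '2011-01-01'::date
   AND "sfe_claim"."from_date" <= '2020-02-13'::date
   AND "sfe_procedure"."code" IN (
       SELECT U0."code"
       FROM "medical_codes_hediscode" U0
       WHERE U0."value_set_name" = 'Colonoscopy'
   )
)
ORDER BY "sfe_claim"."from_date" DESC LIMIT 1;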

Am I missing something obvious – a multicolumn index doesn't seem to work – that could help Postgres see a better plan faster?

What multicolumn index did you try? I would think one on sfe_procedure (patient_id, code) would have a reasonable chance of working.
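
If you don't already have that index, it would be something like this (the index name is just illustrative):

-- Hypothetical name; pick whatever fits your conventions.
CREATE INDEX sfe_procedure_patient_id_code_idx
    ON sfe_procedure (patient_id, code);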

But you can just hit the planner with a big hammer by rewriting it to:

ORDER BY "sfe_claim"."from_date" + interval '0 days' DESC LIMIT 1

This will preclude the use of the falsely appealing index "sfe_claim_from_da_3d3cc6_idx".
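
Spelled out, the full rewrite is the original query with only the ORDER BY changed; adding the interval is a no-op for the sort order, but the planner can no longer match the ORDER BY expression to the from_date index:

SELECT
   "sfe_procedure".*
FROM "sfe_procedure" INNER JOIN "sfe_claim"
   ON ("sfe_procedure"."claim_id" = "sfe_claim"."id")
WHERE (
   "sfe_procedure"."patient_id" = 100
   AND "sfe_claim"."from_date" >= '2011-01-01'::date
   AND "sfe_claim"."from_date" <= '2020-02-13'::date
   AND "sfe_procedure"."code" IN (
       SELECT U0."code"
       FROM "medical_codes_hediscode" U0
       WHERE U0."value_set_name" = 'Colonoscopy'
   )
)
ORDER BY "sfe_claim"."from_date" + interval '0 days' DESC LIMIT 1;

The same trick works for other column types, e.g. ORDER BY some_int_col + 0.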