PostgreSQL planner not using PK index

optimizationpostgresql

I have a query:

select F.DVALUE as f1, VF.VAL as f2, f_strlen( VF.VAL ) as f3
from PR_STR_TO_LIST( '37593' ) L                    
left join DUALS U on U.NUM >= 0 and U.NUM < 2      
left join D_B_CONT_SET_FIX S on U.NUM=1 and S.DOC=L.VAL
left join OD_DOCS D on D.ID = (case when U.NUM=0 then L.VAL else S.PORT_ID end)
left join OD_DOC_CATS C  on C.ID=D.D_CAT
left join OD_DOC_CATS CC on CC.WALK<=C.WALK and C.WALK<(CC.WALK+CC.AMOUNT) and CC.LEV<=C.LEV
left join OP_FIELDS F on F.OP=CC.ID and F.AS_NAME='DESC_ESTIM'
left join OD_OPTIONS VF on VF.OBJECT=D.ID and VF.DESCR=F.ID and VF.B_DATE<='20180715' and VF.E_DATE>'20180715'
where F.ID is not null
order by 3 desc, U.NUM, CC.LEV desc

function PR_STR_TO_LIST( '37593' ) is IMMUTABLE PARALLEL SAFE and return setof T_PR_STR_TO_LIST:

create type T_PR_STR_TO_LIST as (
  VAL integer
);

The query is slow, here it's the execution plan:

Sort  (cost=193656.73..193658.89 rows=866 width=18) (actual time=184.363..184.363 rows=1 loops=1)
  Sort Key: (f_strlen(vf.val)) DESC, u.num, d_2.lev DESC
  Sort Method: quicksort  Memory: 25kB
  ->  Nested Loop Left Join  (cost=104216.04..193614.47 rows=866 width=18) (actual time=184.351..184.354 rows=1 loops=1)
        ->  Hash Join  (cost=104215.62..186270.79 rows=866 width=19) (actual time=184.306..184.309 rows=1 loops=1)
              Hash Cond: (CASE WHEN (u.num = 0) THEN l.val ELSE s.port_id END = d.id)
              ->  Merge Left Join  (cost=84933.39..166862.95 rows=23390 width=12) (actual time=133.823..133.824 rows=2 loops=1)
                    Merge Cond: (l.val = s.doc)
                    Join Filter: (u.num = 1)
                    ->  Sort  (cost=1109.34..1141.84 rows=13000 width=8) (actual time=0.162..0.163 rows=2 loops=1)
                          Sort Key: l.val
                          Sort Method: quicksort  Memory: 25kB
                          ->  Nested Loop Left Join  (cost=0.25..221.03 rows=13000 width=8) (actual time=0.149..0.150 rows=2 loops=1)
                                ->  Function Scan on pr_str_to_list l  (cost=0.25..10.25 rows=1000 width=4) (actual time=0.117..0.117 rows=1 loops=1)
                                ->  Materialize  (cost=0.00..48.31 rows=13 width=4) (actual time=0.030..0.031 rows=2 loops=1)
                                      ->  Seq Scan on duals u  (cost=0.00..48.25 rows=13 width=4) (actual time=0.021..0.021 rows=2 loops=1)
                                            Filter: ((num >= 0) AND (num < 2))
                    ->  Sort  (cost=83824.05..85695.24 rows=748476 width=8) (actual time=133.656..133.656 rows=1 loops=1)
                          Sort Key: s.doc
                          Sort Method: quicksort  Memory: 56931kB
                          ->  Seq Scan on d_b_cont_set_fix s  (cost=0.00..10796.76 rows=748476 width=8) (actual time=0.012..40.324 rows=748476 loops=1)
              ->  Hash  (cost=18106.86..18106.86 rows=94030 width=15) (actual time=49.902..49.902 rows=97361 loops=1)
                    Buckets: 131072  Batches: 1  Memory Usage: 5588kB
                    ->  Nested Loop  (cost=1.26..18106.86 rows=94030 width=15) (actual time=0.054..28.764 rows=97361 loops=1)
                          ->  Nested Loop  (cost=0.83..128.58 rows=15 width=15) (actual time=0.036..0.134 rows=1 loops=1)
                                ->  Nested Loop  (cost=0.55..126.94 rows=1 width=19) (actual time=0.020..0.118 rows=1 loops=1)
                                      ->  Index Scan using op_fields_op_as_name_key on op_fields f  (cost=0.28..118.61 rows=1 width=11) (actual time=0.013..0.110 rows=1 loops=1)
                                            Index Cond: ((as_name)::text = 'DESC_ESTIM'::text)
                                            Filter: (id IS NOT NULL)
                                      ->  Index Scan using pk_od_doc_cats on fs_od_doc_cats d_2  (cost=0.27..8.29 rows=1 width=16) (actual time=0.004..0.004 rows=1 loops=1)
                                            Index Cond: (id = f.op)
                                ->  Index Scan using od_ix_doc_cats_walk on fs_od_doc_cats d_1  (cost=0.28..1.48 rows=15 width=12) (actual time=0.014..0.015 rows=1 loops=1)
                                      Index Cond: ((d_2.walk <= walk) AND (walk < (d_2.walk + d_2.amount)))
                                      Filter: (d_2.lev <= lev)
                          ->  Index Scan using x2x_od_docs on fs_od_docs d  (cost=0.43..952.06 rows=24649 width=8) (actual time=0.017..20.391 rows=97361 loops=1)
                                Index Cond: (d_cat = d_1.id)
        ->  Index Scan using uk_od_options on od_options vf  (cost=0.42..8.44 rows=1 width=11) (actual time=0.023..0.023 rows=0 loops=1)
              Index Cond: ((descr = f.id) AND (object = d.id) AND (b_date <= '2018-07-15 00:00:00'::timestamp without time zone))
              Filter: (e_date > '2018-07-15 00:00:00'::timestamp without time zone)
Planning time: 2.592 ms
Execution time: 196.785 ms

The problem is in row with a case statement. It makes hash join:

Hash Join  (cost=104215.62..186270.79 rows=866 width=19) (actual time=184.306..184.309 rows=1 loops=1)
Hash Cond: (CASE WHEN (u.num = 0) THEN l.val ELSE s.port_id END = d.id)

However it should be index Scan.
Planner ignores OD_DOCS.ID PK column Index. Also columns U.NUM, L.VAL, S.PORT_ID are int as well.

In order to check that I made some experiments. For my example I can get rid of the PR_STR_TO_LIST function and replace it's result with integer constant, query might be simplified like this:

select F.DVALUE as f1, VF.VAL as f2, f_strlen( VF.VAL ) as f3
from DUALS U        
left join D_B_CONT_SET_FIX S on U.NUM=1 and S.DOC=37593
left join OD_DOCS D on D.ID=(case when U.NUM=0 then 37593 else S.PORT_ID end)
left join OD_DOC_CATS C  on C.ID=D.D_CAT
left join OD_DOC_CATS CC on CC.WALK<=C.WALK and C.WALK<(CC.WALK+CC.AMOUNT) and CC.LEV<=C.LEV
left join OP_FIELDS F on F.OP=CC.ID and F.AS_NAME='DESC_ESTIM'
left join OD_OPTIONS VF on VF.OBJECT=D.ID and VF.DESCR=F.ID and VF.B_DATE<='20180715' and VF.E_DATE>'20180715'
where U.NUM >= 0 and U.NUM < 2 and F.ID is not null
order by 3 desc, U.NUM, CC.LEV desc

The plan:

Sort  (cost=678.66..678.67 rows=1 width=18) (actual time=0.310..0.310 rows=1 loops=1)
  Sort Key: (f_strlen(vf.val)) DESC, u.num, d_2.lev DESC
  Sort Method: quicksort  Memory: 25kB
  ->  Nested Loop Left Join  (cost=2.11..678.65 rows=1 width=18) (actual time=0.181..0.300 rows=1 loops=1)
        ->  Nested Loop  (cost=1.68..670.17 rows=1 width=19) (actual time=0.084..0.202 rows=1 loops=1)
              Join Filter: ((d_2.walk <= d_1.walk) AND (d_2.lev <= d_1.lev) AND (d_1.walk < (d_2.walk + d_2.amount)))
              ->  Nested Loop  (cost=0.55..126.94 rows=1 width=19) (actual time=0.028..0.143 rows=1 loops=1)
                    ->  Index Scan using op_fields_op_as_name_key on op_fields f  (cost=0.28..118.61 rows=1 width=11) (actual time=0.014..0.129 rows=1 loops=1)
                          Index Cond: ((as_name)::text = 'DESC_ESTIM'::text)
                          Filter: (id IS NOT NULL)
                    ->  Index Scan using pk_od_doc_cats on fs_od_doc_cats d_2  (cost=0.27..8.29 rows=1 width=16) (actual time=0.010..0.010 rows=1 loops=1)
                          Index Cond: (id = f.op)
              ->  Nested Loop  (cost=1.13..542.97 rows=13 width=16) (actual time=0.054..0.057 rows=1 loops=1)
                    ->  Nested Loop  (cost=0.86..539.20 rows=13 width=12) (actual time=0.048..0.051 rows=1 loops=1)
                          ->  Nested Loop Left Join  (cost=0.42..429.35 rows=13 width=8) (actual time=0.036..0.037 rows=2 loops=1)
                                Join Filter: (u.num = 1)
                                ->  Seq Scan on duals u  (cost=0.00..48.25 rows=13 width=4) (actual time=0.017..0.018 rows=2 loops=1)
                                      Filter: ((num >= 0) AND (num < 2))
                                ->  Materialize  (cost=0.42..343.56 rows=195 width=4) (actual time=0.009..0.009 rows=0 loops=2)
                                      ->  Index Only Scan using pk_d_b_cont_set_fix on d_b_cont_set_fix s  (cost=0.42..342.59 rows=195 width=4) (actual time=0.014..0.014 rows=0 loops=1)
                                            Index Cond: (doc = 37593)
                                            Heap Fetches: 0
                          ->  Index Scan using pk_od_docs on fs_od_docs d  (cost=0.43..8.45 rows=1 width=8) (actual time=0.006..0.006 rows=1 loops=2)
                                Index Cond: (id = CASE WHEN (u.num = 0) THEN 37593 ELSE s.port_id END)
                    ->  Index Scan using pk_od_doc_cats on fs_od_doc_cats d_1  (cost=0.27..0.29 rows=1 width=12) (actual time=0.005..0.005 rows=1 loops=1)
                          Index Cond: (id = d.d_cat)
        ->  Index Scan using uk_od_options on od_options vf  (cost=0.42..8.44 rows=1 width=11) (actual time=0.010..0.010 rows=0 loops=1)
              Index Cond: ((descr = f.id) AND (object = d.id) AND (b_date <= '2018-07-15 00:00:00'::timestamp without time zone))
              Filter: (e_date > '2018-07-15 00:00:00'::timestamp without time zone)
Planning time: 2.198 ms
Execution time: 0.461 ms

Now the query works 400 times faster. And in the same case the planner uses index scan:

Index Scan using pk_od_docs on fs_od_docs d  (cost=0.43..8.45 rows=1 width=8) (actual time=0.006..0.006 rows=1 loops=2)
Index Cond: (id = CASE WHEN (u.num = 0) THEN 37593 ELSE s.port_id END)

It should be mentioned that, OD_DOCS is a view for fs_od_docs.

So the question is how can I help the PostgreSQL planner in this case?

I'm using latest PostgreSQL 10 installation.

—– Update 1
Here is the PR_STR_TO_LISTcode, generally it just converts comma separated list.

create or replace function PR_STR_TO_LIST (STR varchar(2000))
returns setof T_PR_STR_TO_LIST
as $body$
declare
    P$  integer;
    L$  integer;out_rec T_PR_STR_TO_LIST;
    STR$ varchar(2000);
begin
    STR$ := STR;
    P$ := f_substr(',', STR$ );
    L$ := f_strlen(STR$ );
    while (P$>=0) loop
        out_rec.VAL := f_AsInteger(f_left( STR$, P$ ));
        L$ := L$-P$-1;
        STR$ := f_right( STR$, L$ );
        return next out_rec;
        P$ := f_substr( ',', STR$ );
    end loop;
    if (L$>0) then
        out_rec.VAL := f_AsInteger(STR$);
        return next out_rec;
    end if;
    return;
end /*PR_STR_TO_LIST*/;
$body$ LANGUAGE 'plpgsql' PARALLEL SAFE IMMUTABLE CALLED ON NULL INPUT SECURITY INVOKER COST 100;

create or replace function PR_STR_TO_LIST (
    STR integer
) RETURNS setof T_PR_STR_TO_LIST
as $body$
declare
  out_rec T_PR_STR_TO_LIST;
begin
  out_rec.VAL := STR;
  return next out_rec;
end
$body$ LANGUAGE 'plpgsql' PARALLEL SAFE IMMUTABLE CALLED ON NULL INPUT SECURITY INVOKER COST 100;

—– Update 2 I made some changes according to @a_horse_with_no_name answer, now query with inlined sql:

select F.DVALUE as f1, VF.VAL as f2, f_strlen( VF.VAL ) as f3
from unnest(string_to_array(nullif(trim('37594'),''), ',')::int[]) as L         
left join DUALS U on U.NUM >= 0 and U.NUM < 2      
left join D_B_CONT_SET_FIX S on U.NUM=1 and S.DOC=L
left join OD_DOCS D on D.ID = (case when U.NUM=0 then L else S.PORT_ID end)
...

Also one index was missing for D_B_CONT_SET_FIX table, so I fixed that. However the problem remains. Execution plan:

Sort  (cost=108502.17..108505.51 rows=1336 width=18) (actual time=51.349..51.349 rows=0 loops=1)
  Sort Key: (f_strlen(vf.val)) DESC, u.num, d_2.lev DESC
  Sort Method: quicksort  Memory: 25kB
  ->  Nested Loop Left Join  (cost=88804.42..108432.81 rows=1336 width=18) (actual time=51.344..51.344 rows=0 loops=1)
        ->  Hash Join  (cost=88804.00..104611.85 rows=1336 width=19) (actual time=51.343..51.343 rows=0 loops=1)
              Hash Cond: (CASE WHEN (u.num = 0) THEN l.l ELSE s.port_id END = d.id)
              ->  Nested Loop Left Join  (cost=0.44..15494.33 rows=36071 width=12) (actual time=0.062..0.075 rows=2 loops=1)
                    Join Filter: (u.num = 1)
                    ->  Nested Loop Left Join  (cost=0.01..31.73 rows=200 width=8) (actual time=0.052..0.056 rows=2 loops=1)
                          ->  Function Scan on unnest l  (cost=0.01..10.01 rows=100 width=4) (actual time=0.022..0.023 rows=1 loops=1)
                          ->  Materialize  (cost=0.00..1.22 rows=2 width=4) (actual time=0.028..0.031 rows=2 loops=1)
                                ->  Seq Scan on duals u  (cost=0.00..1.21 rows=2 width=4) (actual time=0.021..0.023 rows=2 loops=1)
                                      Filter: ((num >= 0) AND (num < 2))
                    ->  Index Only Scan using pk_d_b_cont_set_fix on d_b_cont_set_fix s  (cost=0.42..40.31 rows=361 width=8) (actual time=0.007..0.007 rows=0 loops=2)
                          Index Cond: (doc = l.l)
                          Heap Fetches: 0
              ->  Hash  (cost=79151.45..79151.45 rows=94167 width=15) (actual time=50.640..50.640 rows=97361 loops=1)
                    Buckets: 131072  Batches: 1  Memory Usage: 5588kB
                    ->  Nested Loop  (cost=1.26..79151.45 rows=94167 width=15) (actual time=0.186..29.367 rows=97361 loops=1)
                          ->  Nested Loop  (cost=0.83..169.92 rows=15 width=15) (actual time=0.169..0.347 rows=1 loops=1)
                                Join Filter: ((d_2.walk <= d_1.walk) AND (d_2.lev <= d_1.lev) AND (d_1.walk < (d_2.walk + d_2.amount)))
                                Rows Removed by Join Filter: 407
                                ->  Index Scan using pk_od_doc_cats on fs_od_doc_cats d_1  (cost=0.27..63.62 rows=408 width=12) (actual time=0.018..0.082 rows=408 loops=1)
                                ->  Materialize  (cost=0.55..60.40 rows=1 width=19) (actual time=0.000..0.000 rows=1 loops=408)
                                      ->  Nested Loop  (cost=0.55..60.39 rows=1 width=19) (actual time=0.030..0.137 rows=1 loops=1)
                                            ->  Index Scan using op_fields_op_as_name_key on op_fields f  (cost=0.28..57.80 rows=1 width=11) (actual time=0.015..0.123 rows=1 loops=1)
                                                  Index Cond: ((as_name)::text = 'DESC_ESTIM'::text)
                                                  Filter: (id IS NOT NULL)
                                            ->  Index Scan using pk_od_doc_cats on fs_od_doc_cats d_2  (cost=0.27..2.58 rows=1 width=16) (actual time=0.011..0.011 rows=1 loops=1)
                                                  Index Cond: (id = f.op)
                          ->  Index Scan using x2x_od_docs on fs_od_docs d  (cost=0.43..2797.04 rows=24684 width=8) (actual time=0.016..20.450 rows=97361 loops=1)
                                Index Cond: (d_cat = d_1.id)
        ->  Index Scan using uk_od_options on od_options vf  (cost=0.42..2.74 rows=1 width=11) (never executed)
              Index Cond: ((descr = f.id) AND (object = d.id) AND (b_date <= '2018-07-15 00:00:00'::timestamp without time zone))
              Filter: (e_date > '2018-07-15 00:00:00'::timestamp without time zone)
Planning time: 2.686 ms
Execution time: 53.637 ms

As for now, this is the solution:

create or replace function PR_STR_TO_LIST (STR varchar(2000))
returns table (val integer)
as $body$
begin
    RETURN QUERY select unnest(string_to_array(nullif(trim(STR),''), ',')::int[]);
end;
$body$ LANGUAGE 'plpgsql' PARALLEL SAFE IMMUTABLE CALLED ON NULL INPUT SECURITY INVOKER COST 5 rows 10;

Why it works and inlined doesn't – a mystery.

Best Answer

Because a PL/pgSQL is a black box to the planner it has no idea of how many rows the function will return.

The best solution is to rewrite the function a plain SQL function which could be inlined by the planner:

Assuming you never pass values that can't be converted to an integer, the following function is a much better choice than your complicated and slow PL/pgSQL function:

create or replace function PR_STR_TO_LIST (STR text)
  returns table (str integer)
as 
$body$
  select unnest(string_to_array(nullif(trim(str),''), ',')::int[]);
$body$
language sql
immutable;

For the callers of the function nothing changes. The function still returns a table (set) with a single integer column named str and the callers won't notice that the implementation changed (only that it got faster).


If for some strange reason you can't change the function's implementation, a workaround for the current inefficient implementation is to modify the number of expected rows for the planner. By default a function is declared with rows 1000 which means the planner expects the function to return 1000 rows (which is visible in your plan in the line Function Scan on pr_str_to_list l (cost=0.25..10.25 rows=1000 width=4).

To mitigate the effects of the inefficient PL/pgSQL implementation, you can reduce that number if you are certain that the function will return less rows in most of the cases by e.g. adding rows 10.

But the better solution is to not use PL/pgSQL for this and use the built-in functions to split a string into rows.


The best solution would be to not require such a function to begin with.