Postgresql – How to change table/indexes structure for a query with two aggregated columns

performancepostgresqlpostgresql-9.1

Do you have any suggestion on how to improve a table structure or its indexes in order to have a single query run faster?

The table has this structure:

    docid          | character varying(17) | not null
    seqnr          | numeric               | not null default 1
    creatinguser   | character varying(50) | not null
    creatingdate   | date                  | 
    lastchangeuser | character varying(50) | not null
    lastchangedate | date                  | 
    description    | character varying(50) | 

Indexes:

    "systemdatabydocument_pkey" PRIMARY KEY, btree (docid, seqnr)
    "sdbdchanged" btree (lastchangedate)
    "sdbdcreated" btree (creatingdate)

My query (please note that the value in WHERE changes) is:

SELECT MAX(creatingDate), MAX(lastChangeDate)
FROM agenzia.SystemDataByDocument
WHERE docId = 'Sedi';

and explain analyze on a old/slow system with about 250k records on postgresql 8.3.17 is:

 Result  (cost=99.05..99.06 rows=1 width=0) (actual time=573.767..573.768 rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..39.81 rows=1 width=8) (actual time=280.417..280.418 rows=1 loops=1)
       ->  Index Scan Backward using sdbdcreated on systemdatabydocument  (cost=0.00..22887.95 rows=575 width=8) (actual time=280.414..280.414 rows=1 loops=1)
             Filter: ((creatingdate IS NOT NULL) AND ((docid)::text = 'Sedi'::text))
     ->  Limit  (cost=0.00..59.24 rows=1 width=8) (actual time=293.332..293.333 rows=1 loops=1)
       ->  Index Scan Backward using sdbdchanged on systemdatabydocument  (cost=0.00..34064.51 rows=575 width=8) (actual time=293.328..293.328 rows=1 loops=1)
             Filter: ((lastchangedate IS NOT NULL) AND ((docid)::text = 'Sedi'::text))
 Total runtime: 573.837 ms

while the same plan on postgresql 9.1.11 (on a new/faster machine with about 500k records) is almost the same:

 Result  (cost=117.40..117.41 rows=1 width=0) (actual time=537.812..537.813 rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.00..59.73 rows=1 width=4) (actual time=279.532..279.532 rows=0 loops=1)
       ->  Index Scan Backward using sdbdcreated on systemdatabydocument  (cost=0.00..30342.46 rows=508 width=4) (actual time=279.525..279.525 rows=0 loops=1)
             Index Cond: (creatingdate IS NOT NULL)
             Filter: ((docid)::text = 'Sedi'::text)
   InitPlan 2 (returns $1)
     ->  Limit  (cost=0.00..57.67 rows=1 width=4) (actual time=258.270..258.270 rows=0 loops=1)
       ->  Index Scan Backward using sdbdchanged on systemdatabydocument  (cost=0.00..29296.76 rows=508 width=4) (actual time=258.267..258.267 rows=0 loops=1)
             Index Cond: (lastchangedate IS NOT NULL)
             Filter: ((docid)::text = 'Sedi'::text)
 Total runtime: 537.924 ms

Thank you very much.

Addendum 1: adding an index with all fields does not help:

=>create index testtesttest3 on agenzia.systemdatabydocument (docid,creatingDate,lastChangeDate);
CREATE INDEX
Tempo: 2180,780 ms
=> explain analyze SELECT MAX(creatingDate), MAX(lastChangeDate) FROM agenzia.SystemDataByDocument WHERE docId = 'Sedi';
                                                                        QUERY PLAN                                                                             
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=62.92..62.93 rows=1 width=0) (actual time=281.283..281.284 rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..3.68 rows=1 width=8) (actual time=0.160..0.161 rows=1 loops=1)
       ->  Index Scan Backward using testtesttest3 on systemdatabydocument  (cost=0.00..2118.04 rows=575 width=8) (actual time=0.157..0.157 rows=1 loops=1)
             Index Cond: ((docid)::text = 'Sedi'::text)
             Filter: (creatingdate IS NOT NULL)
     ->  Limit  (cost=0.00..59.24 rows=1 width=8) (actual time=281.109..281.110 rows=1 loops=1)
       ->  Index Scan Backward using sdbdchanged on systemdatabydocument  (cost=0.00..34063.25 rows=575 width=8) (actual time=281.106..281.106 rows=1 loops=1)
             Filter: ((lastchangedate IS NOT NULL) AND ((docid)::text = 'Sedi'::text))
 Total runtime: 281.365 ms
(10 rows)

Best Answer

As suggested by @Mihai, adding an index that include docid as first column help. The best solution is to create two of them, one for (docid,creatingDate) and the second one for (docid,lastChangeDate). Please note for simplyfing the index scan, the second fild is ordered DESCENDING, infact the first two indexes are ignored. The new plan is now:

=# create index testtesttest1 on agenzia.systemdatabydocument (docid,creatingDate);
CREATE INDEX
=# create index testtesttest2 on agenzia.systemdatabydocument (docid,lastChangeDate);
CREATE INDEX
=# create index testtesttest3 on agenzia.systemdatabydocument (docid,creatingDate DESC);
CREATE INDEX
=# create index testtesttest4 on agenzia.systemdatabydocument (docid,lastChangeDate DESC);
CREATE INDEX
=# explain analyze SELECT MAX(creatingDate), MAX(lastChangeDate) FROM agenzia.SystemDataByDocument WHERE docId = 'Sedi';
                                                                  QUERY PLAN                                                                       
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=5.40..5.41 rows=1 width=0) (actual time=0.151..0.151 rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.00..2.70 rows=1 width=4) (actual time=0.091..0.091 rows=0 loops=1)
       ->  Index Scan using testtesttest3 on systemdatabydocument  (cost=0.00..1372.58 rows=508 width=4) (actual time=0.085..0.085 rows=0 loops=1)
             Index Cond: (((docid)::text = 'Sedi'::text) AND (creatingdate IS NOT NULL))
   InitPlan 2 (returns $1)
     ->  Limit  (cost=0.00..2.70 rows=1 width=4) (actual time=0.054..0.054 rows=0 loops=1)
       ->  Index Scan using testtesttest4 on systemdatabydocument  (cost=0.00..1372.58 rows=508 width=4) (actual time=0.053..0.053 rows=0 loops=1)
             Index Cond: (((docid)::text = 'Sedi'::text) AND (lastchangedate IS NOT NULL))
 Total runtime: 0.256 ms