Postgresql low performance with max and min (wrong index?)

amazon-rdspostgresql

I'd appreciate some help here, I'm new with Postgresql and don't fully understand what's happening here.

I have the following table:

history=> select * from last_sales.store01 limit 5;  
     item  |  timestamp   |        price         
  ---------+---------------+----------------------
 pants001  | 1579686380060 | 168.7500000000000000
 shoes001  | 1579686382376 |   0.0000278500000000
 tshirt001 | 1579686381610 |   0.0194620000000000
 pants001  | 1579686385421 |  18.1308000000000000
 jacket001 | 1579686386952 |   0.0000271400000000

With the following indexes:

history=> SELECT * FROM pg_indexes where tablename='store01'; 
schemaname  | tablename |                  indexname                 | tablespace |                                                        indexdef
-------------+-----------+-------------------------------------------+------------+-------------------------------------------------------------------------------------------------------------------------------------
 last_sales | store01   | store01_item_timestamp_price_key           |            |  CREATE UNIQUE INDEX store01_item_timestamp_price_key ON last_sales.store01 USING btree (item, "timestamp", price)
 last_sales | store01   | store01_item_timestamp_desc_key            |            | CREATE INDEX store01_item_timestamp_desc_key ON last_sales.store01 USING btree (item, "timestamp" DESC)
 last_sales | store01   | store01_item_timestamp_desc_price_key      |            | CREATE INDEX store01_item_timestamp_desc_price_key ON last_sales.store01 USING btree (item, "timestamp" DESC, price)
 last_sales | store01   | store01_price_item_timestamp_desc_key      |            | CREATE INDEX store01_price_item_timestamp_desc_key ON last_sales.store01 USING btree (price, item, "timestamp" DESC)
 last_sales | store01   | store01_price_desc_item_timestamp_desc_key |            | CREATE INDEX store01_price_desc_item_timestamp_desc_key ON last_sales.store01 USING btree (price DESC, item, "timestamp" DESC)
 last_sales | store01   | store01_price_asc_item_timestamp_desc_key  |            | CREATE INDEX store01_price_asc_item_timestamp_desc_key ON last_sales.store01 USING btree (price, item, "timestamp" DESC) 
(6 rows)

The table has 5,321,707 entries.

And these are the queries that I'm performing:

history=> explain analyze select max(price) as maxprice from last_sales.store01 where item = 'shoe001' and timestamp >= 1579860457181;
                                                                                          QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=734.23..734.24 rows=1 width=32) (actual time=2212.094..2212.095 rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.56..734.23 rows=1 width=6) (actual time=2212.084..2212.086 rows=1 loops=1)
           ->  Index Only Scan Backward using store01_price_asc_item_timestamp_desc_key on store01 (cost=0.56..173147.88 rows=236 width=6) (actual time=2212.082..2212.082 rows=1 loops=1)
                 Index Cond: ((price IS NOT NULL) AND (item = 'shoe001'::text) AND ("timestamp" >= '1579860457181'::bigint))
                 Heap Fetches: 1
 Planning Time: 8.987 ms
 Execution Time: 2212.129 ms
(8 rows)

And almost the same one but with min, instead of max:

history=> explain analyze select min(price) as maxprice from last_sales.store01 where item = 'pants001' and timestamp >= 1579860457181;
                                                                                      QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Result  (cost=734.23..734.24 rows=1 width=32) (actual time=1907.985..1907.986 rows=1 loops=1)
   InitPlan 1 (returns $0)
      ->  Limit  (cost=0.56..734.23 rows=1 width=6) (actual time=1907.976..1907.978 rows=1 loops=1)
            ->  Index Only Scan using store01_price_asc_item_timestamp_desc_key on store01 (cost=0.56..173147.88 rows=236 width=6) (actual time=1907.974..1907.975 rows=1 loops=1)
                  Index Cond: ((price IS NOT NULL) AND (item = 'pants001'::text) AND ("timestamp" >= '1579860457181'::bigint))
                  Heap Fetches: 1
Planning Time: 0.227 ms
Execution Time: 1908.015 ms
(8 rows)

I think the performance of both queries, is poor (both around 2k ms). This is a test DB, in an AWS RDS t2.micro, with no activity other than the query.

Also, is using the same index for min and max?

Best Answer

First step is to VACUUM ANALYZE the table to make sure the table isn't full of dead tuples and the statistics are up to day.

For a btree index to be maximally efficient, all but one of the conditions on it need to be equality. You have one equality, one inequality, and one max, which is pretty much the same as an inequality. Also, all the equality conditions need to be on the leading columns.

The index which is most likely to be most useful is going to put the equality first, so that would be either (item, price, timestamp) or (item, timestamp, price) You already have the last of those, so try adding the first and see if it gets used. But, perhaps the last of those really is the best, and the planner is incorrectly not choosing it. So another thing you can try is dropping the indexes starting with price, and seeing what it does then. Unless your "price" column is mostly NULL, it is hard to see what makes the planner think that this is a good index to use.

PostgreSQL knows how to follow an index either forwards or backwards, so in general it can use the same index for both min and max.