Postgresql – Postgres 8.4.5: Bad query plan vs Good query plan when changing date range on large data table

performancepostgresqlquery-performance

I have a table with 13.7 million rows. Up until recently, queries on this table have been quite fast. Suddenly they are slow. More specifically, the same query is fast when looking at a large date range, but crazy slow when looking at a small date range. Here are the two statements and their EXPLAIN ANALYZE output.

I am using Postgresql 8.4.5 with default_statistics_target set to the default 100.

It's obviously choosing a bad query plan for the shorter date range, but how do I resolve this? And how do I interpret the query plan?

Fast query:

EXPLAIN ANALYZE SELECT pp.id as x, pp.firstname || ' ' || pp.lastname || CASE WHEN pp.active = 'f' OR pp.userid IS NULL THEN '*'  ELSE '' END as name
FROM time t
INNER JOIN person pp ON t.personid = pp.id AND pp.visible = 't'
INNER JOIN project p ON t.projectid = p.id AND p.visible = 't'
WHERE true AND t.visible <> 'f' AND p.customerid = 9
AND t.date >= '01/11/2013 00:00:00' AND t.date <= '03/17/2013 23:59:59'
GROUP BY pp.id, pp.firstname, pp.lastname, pp.active, pp.userid
ORDER BY name ASC NULLS FIRST;
                                                                             QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=59930.72..59934.61 rows=1556 width=22) (actual time=69.718..69.719 rows=4 loops=1)
   Sort Key: (((((pp.firstname)::text || ' '::text) || (pp.lastname)::text) || CASE WHEN ((NOT pp.active) OR (pp.userid IS NULL)) THEN '*'::text ELSE ''::text END))
   Sort Method:  quicksort  Memory: 25kB
   ->  HashAggregate  (cost=59821.00..59848.23 rows=1556 width=22) (actual time=69.701..69.708 rows=4 loops=1)
         ->  Nested Loop  (cost=0.00..59801.55 rows=1556 width=22) (actual time=0.069..68.817 rows=1460 loops=1)
               ->  Nested Loop  (cost=0.00..59334.43 rows=1556 width=4) (actual time=0.056..64.660 rows=1460 loops=1)
                     ->  Index Scan using project_fkindex1 on project p  (cost=0.00..9.82 rows=38 width=4) (actual time=0.022..0.244 rows=361 loops=1)
                           Index Cond: (customerid = 9)
                           Filter: visible
                     ->  Index Scan using time_fkindex6 on "time" t  (cost=0.00..1560.66 rows=41 width=8) (actual time=0.063..0.177 rows=4 loops=361)
                           Index Cond: (t.projectid = p.id)
                           Filter: ((t.visible <> false) AND (t.date >= '2013-01-11'::date) AND (t.date <= '2013-03-17'::date))
               ->  Index Scan using person_pkey on person pp  (cost=0.00..0.29 rows=1 width=22) (actual time=0.002..0.002 rows=1 loops=1460)
                     Index Cond: (pp.id = t.personid)
                     Filter: pp.visible
 Total runtime: 69.817 ms
(16 rows)

Time: 71.354 ms

And the SLOW query:

EXPLAIN ANALYZE SELECT pp.id as x, pp.firstname || ' ' || pp.lastname || CASE WHEN pp.active = 'f' OR pp.userid IS NULL THEN '*'  ELSE '' END as name
FROM time t
INNER JOIN person pp ON t.personid = pp.id AND pp.visible = 't'
INNER JOIN project p ON t.projectid = p.id AND p.visible = 't'
WHERE true AND t.visible <> 'f' AND p.customerid = 9
AND t.date >= '03/11/2013 00:00:00' AND t.date <= '03/17/2013 23:59:59'
GROUP BY pp.id, pp.firstname, pp.lastname, pp.active, pp.userid
ORDER BY name ASC NULLS FIRST;
                                                                             QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=25428.46..25428.56 rows=39 width=22) (actual time=31366.375..31366.376 rows=4 loops=1)
   Sort Key: (((((pp.firstname)::text || ' '::text) || (pp.lastname)::text) || CASE WHEN ((NOT pp.active) OR (pp.userid IS NULL)) THEN '*'::text ELSE ''::text END))
   Sort Method:  quicksort  Memory: 25kB
   ->  HashAggregate  (cost=25426.75..25427.43 rows=39 width=22) (actual time=31366.363..31366.366 rows=4 loops=1)
         ->  Nested Loop  (cost=664.21..25426.26 rows=39 width=22) (actual time=95.334..31366.243 rows=118 loops=1)
               ->  Nested Loop  (cost=664.21..25403.10 rows=39 width=4) (actual time=95.317..31365.758 rows=118 loops=1)
                     ->  Index Scan using project_fkindex1 on project p  (cost=0.00..9.82 rows=38 width=4) (actual time=0.022..1.174 rows=361 loops=1)
                           Index Cond: (customerid = 9)
                           Filter: visible
                     ->  Bitmap Heap Scan on "time" t  (cost=664.21..668.23 rows=1 width=8) (actual time=86.871..86.877 rows=0 loops=361)
                           Recheck Cond: ((t.projectid = p.id) AND (t.date >= '2013-03-11'::date) AND (t.date <= '2013-03-17'::date))
                           Filter: (t.visible <> false)
                           ->  BitmapAnd  (cost=664.21..664.21 rows=1 width=0) (actual time=86.865..86.865 rows=0 loops=361)
                                 ->  Bitmap Index Scan on time_fkindex6  (cost=0.00..18.81 rows=584 width=0) (actual time=1.017..1.017 rows=290 loops=361)
                                       Index Cond: (t.projectid = p.id)
                                 ->  Bitmap Index Scan on time_index2057  (cost=0.00..639.15 rows=25008 width=0) (actual time=92.211..92.211 rows=108649 loops=336)
                                       Index Cond: ((t.date >= '2013-03-11'::date) AND (t.date <= '2013-03-17'::date))
               ->  Index Scan using person_pkey on person pp  (cost=0.00..0.58 rows=1 width=22) (actual time=0.003..0.003 rows=1 loops=118)
                     Index Cond: (pp.id = t.personid)
                     Filter: pp.visible
 Total runtime: 31366.469 ms
(21 rows)

Time: 31367.644 ms

UPDATE: Here is what the pg_stats table is telling me. It looks like increasing the number of values it can hold may help:

SELECT attname, n_distinct, most_common_vals
FROM pg_stats
WHERE tablename = 'time' and attname = 'date';
 attname | n_distinct |


                                               most_common_vals



---------+------------+------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------
 date    |       1899 | {2012-09-13,2013-02-05,2013-03-04,2011-03-14,2012-10-03,2013-01-14,2012-11-20,2013-02-07,2012-10-22,2013-01-08,2013-02-01,2013-02-06,2012-12-13,2013-
01-09,2012-12-04,2013-01-16,2012-05-31,2012-09-28,2013-02-08,2012-02-09,2012-08-23,2012-11-19,2012-12-06,2012-12-12,2013-01-07,2013-02-22,2013-02-26,2013-03-01,2012-05-17,20
12-07-16,2012-10-15,2012-10-25,2012-11-12,2013-02-11,2013-02-25,2012-09-18,2012-11-05,2012-11-27,2013-02-13,2013-03-07,2012-03-29,2012-06-05,2012-07-10,2012-09-05,2012-11-13
,2012-11-14,2012-12-03,2013-01-28,2013-02-18,2013-02-20,2012-04-24,2012-04-25,2012-05-22,2012-08-30,2012-10-05,2012-11-08,2013-01-18,2013-02-04,2013-02-14,2013-03-14,2012-04
-04,2012-06-11,2012-08-06,2012-08-31,2012-09-20,2012-09-25,2012-10-16,2013-01-22,2012-05-30,2012-08-01,2012-08-09,2012-10-19,2012-12-14,2013-02-12,2013-03-06,2012-02-08,2012
-05-14,2012-07-05,2012-07-24,2012-07-25,2012-08-27,2012-11-26,2012-12-19,2013-01-10,2013-01-21,2013-01-30,2011-11-30,2012-01-05,2012-04-16,2012-05-03,2012-05-04,2012-08-03,2
012-08-07,2012-09-07,2012-10-26,2012-10-30,2012-12-05,2012-12-21,2011-07-06,2012-02-01}
(1 row)

Time: 30.530 ms

Best Answer

This is a "community wiki" answer provided by the OP, which I've removed from the question.


SOLVED.

I was able to resolve this issue by altering the time.date column to allow 1000 stats: ALTER TABLE time ALTER COLUMN date SET STATISTICS 1000;

Now the queries are running very fast, down to 300ms from 5s.

I'll be looking into patching postgres as well.