PostgreSQL Partitioning – Slow Performance When Using now() to Filter

I have a PostgreSQL v10 table partitioned on a timestamp field:

CREATE TABLE front.probes (
    bucket_start timestamp NOT NULL DEFAULT now(),
    -- Extra fields here ...
) PARTITION BY RANGE (bucket_start);

The partitions themselves are created as below:

CREATE TABLE IF NOT EXISTS front.probes_20190611
PARTITION OF front.probes
FOR VALUES FROM ('2019-06-11 00:00:00') TO ('2019-06-12 00:00:00');

If query this table by hard-coding my timestamp, it performs as expected (171ms).

Note: For consistency, the timestamp displayed below was pre-queried by running select now() at time zone 'utc'.

select min(bucket_start), max(bucket_start) 
from front.probes
where bucket_start < '2019-06-11 09:06:47'::timestamp - interval '30 minutes'
and bucket_start >= '2019-06-11 09:06:47'::timestamp - interval '70 minutes';

min                |max                |
-------------------|-------------------|
2019-06-11 08:00:00|2019-06-11 08:35:00|

(0.171s)

In contrast, if I use now() at time zone 'utc' directly in the query, it yeilds the same results, but performance degrades spectacularly (11.14s):

select min(bucket_start), max(bucket_start)
from front.probes
where bucket_start < now() at time zone 'utc' - interval '30 minutes'
and bucket_start >= now() at time zone 'utc' - interval '70 minutes';

min                |max                |
-------------------|-------------------|
2019-06-11 08:00:00|2019-06-11 08:35:00|

(11.140s)

The query plans show that the "fast" version takes advantage of partition constraints:

QUERY PLAN                                                                                                                                                                      |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
Finalize Aggregate  (cost=27177.24..27177.25 rows=1 width=16)                                                                                                                   |
  ->  Gather  (cost=27177.02..27177.23 rows=2 width=16)                                                                                                                         |
        Workers Planned: 2                                                                                                                                                      |
        ->  Partial Aggregate  (cost=26177.02..26177.03 rows=1 width=16)                                                                                                        |
              ->  Append  (cost=0.00..24873.83 rows=260639 width=8)                                                                                                             |
                    ->  Parallel Seq Scan on probes_20190611  (cost=0.00..24873.83 rows=260639 width=8)                                                                         |
                          Filter: ((bucket_start < '2019-06-11 08:36:47'::timestamp without time zone) AND (bucket_start >= '2019-06-11 07:56:47'::timestamp without time zone))|

While the now() version doesn't:

QUERY PLAN                                                                                                                                                                |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
Aggregate  (cost=2119587.51..2119587.52 rows=1 width=16)                                                                                                                  |
  ->  Append  (cost=0.00..2116094.94 rows=698514 width=8)                                                                                                                 |
        ->  Seq Scan on probes_20190605  (cost=0.00..713298.11 rows=1 width=8)                                                                                            |
              Filter: ((bucket_start < (timezone('utc'::text, now()) - '00:30:00'::interval)) AND (bucket_start >= (timezone('utc'::text, now()) - '01:10:00'::interval)))|
        (... full partition list, omitted for clarity ...)
        ->  Seq Scan on probes_20190617  (cost=0.00..21.40 rows=2 width=8)                                                                                                |
              Filter: ((bucket_start < (timezone('utc'::text, now()) - '00:30:00'::interval)) AND (bucket_start >= (timezone('utc'::text, now()) - '01:10:00'::interval)))|

Documentation shows that now() returns the date and time at the start of current transaction so I was expecting it to be resolved before the query plan is made, so that my query can take advantage of partitioning constraints, but it doesn't seem to be the case.

Is there any way to work around this? If possible we would like to avoid setting manual timestamps in "relative" time queries.

Best Answer

A PostgreSQL query is not necessarily executed in the same transaction as it is executed, think for example of prepared statements.

Consequently, now() cannot be evaluated at planning time.

PostgreSQL v11 can prune partitions at execution time, so your query should work as intended with PostgreSQL v11. You are out of luck with v10 though.

Best Answer

Related Solutions

Postgresql – Optimizing ORDER BY in a full text search query

Postgresql – How to speed up a Postgres query containing lots of Joins with an ILIKE condition

Related Question