PostgreSQL 9.6 Performance – Partition Constraint Not Used for Joins by Timestamp

Tags: partitioning, performance, postgresql, postgresql-9.6, query-performance

I have a partitioned table structure like:

CREATE TABLE measurements (
    sensor_id bigint,
    tx timestamp,
    measurement int
);

CREATE TABLE measurements_201201(
    CHECK (tx >= '2012-01-01 00:00:00'::timestamp without time zone 
       AND tx < ('2012-01-01 00:00:00'::timestamp without time zone + '1 mon'::interval))    
)INHERITS (measurements);
CREATE INDEX ON measurements_201201(sensor_id);
CREATE INDEX ON measurements_201201(tx);
CREATE INDEX ON measurements_201201(sensor_id, tx);
....

And so on. Each table has approximately 20M rows.

If I query for a sample of sensors and a sample of timestamps in the WHERE clause, the query plan shows the correct tables being selected and indexes being used, e.g.:

SELECT *
FROM measurements
INNER JOIN sensors TABLESAMPLE BERNOULLI (0.01) USING (sensor_id)
WHERE tx BETWEEN '2015-01-04 05:00' AND '2015-01-04 06:00' 
    OR tx BETWEEN '2015-02-04 05:00' AND '2015-02-04 06:00' 
    OR tx BETWEEN '2014-03-05 05:00' AND '2014-04-07 06:00' ;

However, if I use a CTE, or put the timestamp values into a table (not shown; the result is the same even with indexes on the temporary table), the partitions are no longer excluded. Something like the below:

SET constraint_exclusion = on;

WITH sensor_sample AS (
    SELECT sensor_id, start_ts, end_ts
    FROM sensors TABLESAMPLE BERNOULLI (0.01)
    CROSS JOIN (VALUES (TIMESTAMP '2015-01-04 05:00', TIMESTAMP '2015-01-04 06:00'),
        (TIMESTAMP '2015-02-04 05:00', TIMESTAMP '2015-02-04 06:00'),
        (TIMESTAMP '2014-03-05 05:00', TIMESTAMP '2014-04-07 06:00')) tstamps(start_ts, end_ts)
)
SELECT * FROM measurements
INNER JOIN sensor_sample USING (sensor_id)
WHERE tx BETWEEN start_ts AND end_ts;

This performs an index scan on every table. That is still relatively fast, but as queries grow more complex it can turn into seq scans, which would be very slow for retrieving ~40K rows from a limited subset of the partitioned tables (4-5 of 50).

I'm concerned that something like this is the problem.

For non-trivial expressions you have to repeat the more or less verbatim condition in queries to make the Postgres query planner understand it can rely on the CHECK constraint. Even if it seems redundant!

How can I improve partitioning and query structure to reduce the likelihood of running seq scans on all my data?

Best Answer

Constraint-based exclusion (CBE) is performed at an early stage of query planning, just after the query is parsed, mapped to actual relations, and rewritten (see the PostgreSQL internals documentation, Planner/Optimizer stage).

The planner cannot assume anything about the contents of the "sensor_sample" table.

So unless you have values hardcoded in the query, the planner will not exclude "partitions".
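To see the difference described here, one can compare the plans for literal bounds with the plans for bounds that come from a joined table. The following is a minimal sketch against the question's measurements table. The temporary table name ts_ranges is made up for illustration, and the exact plan output will depend on the data and configuration.

```sql
-- 'partition' is the default setting and covers inheritance children.
SET constraint_exclusion = partition;

-- Bounds written as literals: the planner can compare them with each child's
-- CHECK constraint at plan time and leave non-matching children out of the plan.
EXPLAIN
SELECT *
FROM measurements
WHERE tx BETWEEN TIMESTAMP '2015-01-04 05:00' AND TIMESTAMP '2015-01-04 06:00';

-- Hypothetical helper table holding the same bounds as data.
CREATE TEMP TABLE ts_ranges (start_ts timestamp, end_ts timestamp);
INSERT INTO ts_ranges VALUES ('2015-01-04 05:00', '2015-01-04 06:00');
ANALYZE ts_ranges;

-- Bounds that arrive through a join: their values are only known at execution
-- time, so every child table is expected to remain in the plan.
EXPLAIN
SELECT m.*
FROM measurements AS m
JOIN ts_ranges AS r ON m.tx BETWEEN r.start_ts AND r.end_ts;
```

If the first plan lists only the matching child table and the second lists all of them, that matches the behaviour described above.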

My guess about the CTE variant is that the planner is restricted because you use TABLESAMPLE, so the whole subquery may be treated as volatile even if the literals in it are static (that's just my guess; I'm not an expert on planner code).

On the bright side, an index scan with a negative result is blazingly fast (a single page scan at most!), so unless you have over 10000 partitions, I would not bother.

So, to answer your question directly:

You cannot improve this data structure much more.
Regarding index scans: they are cheap.
Regarding sequential scans: they are avoided when possible, as you can see in your own examples.

Related Solutions

Postgresql – Postgres not using the index even when rows returned is 5% of the table

From your query plans, it looks like you're comparing ints to ints in the first query plan, and int to numeric in the second plan.

Your first comparison:

Index Cond: (("timestamp" >= 1431100800) AND ("timestamp" <= 1431108000))

and

timestamp >= 1431100800 and timestamp <= 1431108000

In the second query, it's numeric values:

Filter: ((numvalues[1] IS NOT NULL) AND (("timestamp")::numeric >= 1431100800.00) AND (("timestamp")::numeric <= 1431108000.00))

and

timestamp >= 1431093600.00 and timestamp <= 1431100800.00

Casting to numeric causes the index to be ignored in favor of a sequential scan.

You can see this with a very simple example, set up below:

CREATE TABLE t2 (a int);
CREATE INDEX t2_a_idx ON t2(a);
INSERT INTO t2 (a) SELECT i FROM generate_series(1,1000000) AS i;
VACUUM ANALYZE VERBOSE t2;

My first query plan looks like this:

EXPLAIN ANALYZE SELECT * FROM t2 WHERE a > 750000;

Index Only Scan using t2_a_idx on t2 (cost=0.42..7134.65 rows=250413 width=4) (actual time=0.019..29.926 rows=250000 loops=1)
Index Cond: (a > 750000)
Heap Fetches: 0
Planning time: 0.137 ms
Execution time: 39.114 ms
(5 rows)
Time: 39.540 ms

While a second query using numerics looks like this:

EXPLAIN ANALYZE SELECT * FROM t2 WHERE a > 750000.00;

Seq Scan on t2  (cost=0.00..19425.00 rows=333333 width=4) (actual time=122.803..175.326 rows=250000 loops=1)
Filter: ((a)::numeric > 750000.00)
Rows Removed by Filter: 750000
Planning time: 0.058 ms
Execution time: 184.194 ms
(5 rows)
Time: 184.487 ms

In the second instance here, the index is ignored in favor of a sequential scan because of the cast to a numeric value, which looks like exactly what's happening in your two examples.
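One way to keep the comparison in integer arithmetic is to cast the constant rather than the column. This is a sketch using the t2 table from above, not something taken from the original query plans. Note that a cast from numeric to int rounds, so for fractional bounds floor() or ceil() is needed to keep the same meaning.

```sql
-- Casting the constant (not the column) leaves an int-to-int comparison,
-- which the index on t2(a) can serve.
EXPLAIN SELECT * FROM t2 WHERE a > 750000.00::int;

-- With a fractional bound, a plain cast would round and change the result.
-- For an integer column, "a > 750000.50" is equivalent to "a > 750000":
EXPLAIN SELECT * FROM t2 WHERE a > floor(750000.50)::int;
```

In both statements the Filter on (a)::numeric should no longer appear, since the column itself is not cast.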

One last aside: you might also be able to speed your query up by raising work_mem with a SET command before executing it:

SET work_mem = '2GB';

This only makes sense if your server can handle it. I suggest it because your sorts are spilling to disk, as noted in this line from your query plan:

Sort Method: external merge  Disk: 1387704kB
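If raising work_mem for the whole session feels too broad, the setting can be limited to a single transaction with SET LOCAL. This is a sketch that reuses the 2GB figure purely as an example value. Bear in mind that work_mem is a per-operation limit, so a plan with several sorts or hashes can use a multiple of it.

```sql
BEGIN;
-- Applies only until COMMIT or ROLLBACK.
SET LOCAL work_mem = '2GB';
-- ... run the sort-heavy query here ...
COMMIT;

-- Back to the previous value.
SHOW work_mem;
```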

Hope this helps. =)

Postgresql – How to merge partitions in Postgres

If you are still inserting data for 2014, then you risk problems with this method, because rows inserted between steps 2 and 5 are going to end up getting dropped rather than moved.

If you are not still inserting data for 2014, then I think you should change step 5 to "rewrite the trigger to throw an error upon insertion of 2014 data" and move it up to be step 0. That would remove the doubt.
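As a rough illustration of a routing trigger that refuses 2014 rows, here is a sketch in PostgreSQL 9.6 syntax. Every name in it (the function events_insert_router, the child table events_2015, the column created_at) is hypothetical, since the original schema is not shown here. It also assumes the function is already attached to the parent table as a BEFORE INSERT row trigger, so replacing the function body is enough.

```sql
CREATE OR REPLACE FUNCTION events_insert_router() RETURNS trigger
LANGUAGE plpgsql AS
$$
BEGIN
    IF NEW.created_at >= TIMESTAMP '2014-01-01'
       AND NEW.created_at < TIMESTAMP '2015-01-01' THEN
        -- Refuse 2014 rows while the 2014 partitions are being reorganized.
        RAISE EXCEPTION 'inserts of 2014 data are disabled (created_at = %)', NEW.created_at;
    ELSIF NEW.created_at >= TIMESTAMP '2015-01-01'
       AND NEW.created_at < TIMESTAMP '2016-01-01' THEN
        -- Hypothetical 2015 child table; the real layout may differ.
        INSERT INTO events_2015 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'no partition for created_at = %', NEW.created_at;
    END IF;
    RETURN NULL;  -- the row has been routed; skip the insert into the parent
END;
$$;
```

Failing loudly means a stray 2014 insert aborts its own transaction instead of landing in a table that is about to be dropped.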

But it does seem like you are doing a lot of tinkering. If you want to remove the partitioning for 2014, why would you want to keep it for 2015? Why did you implement partitioning in the first place and why is that reason no longer valid for 2014 (but still valid for 2015)? Getting rid of the partitioning might speed up the queries, but it might not. Reorganizing partitioned tables isn't something you should do on a hunch. Do you have a QA system you can use to time the queries and see if they get faster? And wouldn't you want faster queries for the newer data at least as much as for the older data?
