PostgreSQL – Slow Postgres Aggregate Query

optimization, performance, postgresql, query, query-performance

I need to calculate values for a series of measurement data. The measurement data is divided into 1000 groups of consecutive rows, as each call of the query must output a series of 1000 values. Typical input sets will contain between 10,000 and 1,000,000 rows. The following solution is the best idea I could come up with. It works fine, but execution is rather slow. Because one of my requirements is to trigger the calculation quite often, I need to optimize for execution time. Unfortunately it's not an option to precalculate the values for the individual groups, because every new measurement row affects the group sizes.

Schema Setup

create table devices
(
    id varchar not null
        constraint devices_pkey
            primary key
);


create table processes
(
    id        integer not null,

    constraint processes_pkey
        primary key (id, device_id),

    device_id varchar not null
        constraint fk_processes_devices
            references devices
            on delete cascade
);

create index processes_device_id_idx on processes (device_id);


create table measurements
(
    timestamp  timestamp with time zone not null,
    current    real                     not null,
    process_id integer                  not null,
    device_id  varchar                  not null,

    constraint measurements_pkey
        primary key (timestamp, process_id, device_id),

    constraint fk_measurements_processes
        foreign key (process_id, device_id) references processes
            on delete cascade
);

create index measurements_process_id_device_id_idx on measurements (device_id, process_id);


INSERT INTO devices (id) VALUES ('123');

INSERT INTO processes (id, device_id) VALUES (456, '123');

WITH numbers AS (
  SELECT *
  FROM generate_series(1, 1000000)
)

INSERT INTO measurements (timestamp, current, process_id, device_id)
SELECT NOW() + (generate_series * interval '1 second'), generate_series * random(), 456, '123'
FROM numbers;

Query

select min(timestamp) as timestamp,
       case when sum(current) < 0 then -SQRT(AVG(POWER(current, 2))) else SQRT(AVG(POWER(current, 2))) end,
       456            as process_id,
       '123'          as device_id,
       index / 1000   as group_index
from (select timestamp,
             current,
             row_number() over (order by timestamp) as index
      from measurements
      where device_id = '123'
        and process_id = 456
      order by timestamp) as subquery
group by group_index
order by group_index;

db-fiddle: https://www.db-fiddle.com/f/uVTcf9Q2JDEkPf3S5hgvfB/2

Query plan visualization: http://tatiyants.com/pev/#/plans/plan_1569689658707

How can I optimize this query?

Query Plan

Sort  (cost=100157.88..100158.38 rows=200 width=60) (actual time=927.340..927.402 rows=1001 loops=1)
  Sort Key: ((subquery.index / 1000))
  Sort Method: quicksort  Memory: 103kB
  ->  HashAggregate  (cost=100144.74..100150.24 rows=200 width=60) (actual time=926.828..927.036 rows=1001 loops=1)
        Group Key: (subquery.index / 1000)
        ->  Subquery Scan on subquery  (cost=0.42..77644.74 rows=1000000 width=20) (actual time=0.049..704.478 rows=1000000 loops=1)
              ->  WindowAgg  (cost=0.42..65144.74 rows=1000000 width=20) (actual time=0.046..576.692 rows=1000000 loops=1)
                    ->  Index Scan using measurements_pkey on measurements  (cost=0.42..50144.74 rows=1000000 width=12) (actual time=0.029..219.951 rows=1000000 loops=1)
                          Index Cond: ((process_id = 456) AND ((device_id)::text = '123'::text))
Planning Time: 0.378 ms
Execution Time: 927.591 ms

Best Answer

There is nothing obviously wrong with the plan, so there is no obvious large optimization to be done. You are fundamentally doing a lot of work, and it takes a lot of time.

I would probably start by taking a step back and looking at your business case. Why do you need exactly this output? Could you perhaps "need" something easier to optimize? Like, something where chunks slide past in their entirety, rather than individual rows slowly passing from chunk to chunk on their way to oblivion? Or maybe where partitions are defined by predictable timestamp slices, rather than by having exactly 1000 rows per partition?
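
For example, if grouping by fixed-width time buckets were acceptable instead of exactly 1000 rows per group (that is an assumption on my part, not something your requirements state), the grouping key becomes a pure function of the timestamp, and completed buckets could even be pre-aggregated. A minimal sketch, using one-minute buckets:

-- Hypothetical alternative grouping, assuming fixed one-minute buckets are acceptable
select date_trunc('minute', timestamp) as bucket,
       case when sum(current) < 0
            then -sqrt(avg(power(current, 2)))
            else  sqrt(avg(power(current, 2)))
       end             as value,
       456             as process_id,
       '123'           as device_id
from measurements
where device_id = '123'
  and process_id = 456
group by bucket
order by bucket;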

You didn't give your version, but your fiddle uses 9.5. If I use the latest version and turn on JIT (just-in-time) compilation, I get about a 15% improvement over JIT off.
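
For reference, JIT only exists from PostgreSQL 11 on (and is enabled by default from 12), and it can be toggled per session to measure the difference on your own workload:

-- JIT is available from PostgreSQL 11 and on by default from 12
set jit = on;              -- set to off to compare
set jit_above_cost = 0;    -- force JIT even for cheaper plans while benchmarking
-- then re-run the aggregate query with EXPLAIN (ANALYZE) and compare timings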

If I build an index suitable for index-only scans, I don't get much improvement; it is sometimes faster and sometimes slower (it also interacts with JIT). But if everything were not already in memory, it might give a huge improvement:

create index  on measurements (device_id, process_id, timestamp, current);
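
Note that index-only scans also depend on the visibility map being reasonably current, so it is worth vacuuming before judging the effect. A sketch of the check, with your query slightly simplified (the sign handling via CASE is dropped for brevity):

-- Keep the visibility map fresh so the index-only scan avoids heap fetches
vacuum analyze measurements;

-- Look for "Index Only Scan" and a low "Heap Fetches" count in the output
explain (analyze, buffers)
select min(timestamp)               as timestamp,
       sqrt(avg(power(current, 2))) as value,
       index / 1000                 as group_index
from (select timestamp,
             current,
             row_number() over (order by timestamp) as index
      from measurements
      where device_id = '123'
        and process_id = 456) as subquery
group by group_index
order by group_index;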

If you are looking for more complex solutions, you could write a process which connects to the database and LISTENs for new INSERTs, then updates the data by sliding rows out of one partition and into the next using incremental calculations, rather than recomputing each partition from scratch. This is similar to how a moving aggregate works, except that moving aggregates only operate between frames within a single statement, not between statements as you need. This should be very fast. For strict accuracy, it would depend on inserted rows committing in the same order as their timestamp values; that requirement is hard to guarantee in general, but might be easy to meet in your exact situation.
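
A sketch of the notification side only, assuming PostgreSQL 11+ and a hypothetical channel name new_measurement; the listening worker and its incremental bookkeeping are left out:

-- Hypothetical: tell a listening worker about every new row so it can shift
-- group boundaries incrementally instead of recomputing each group from scratch
create or replace function notify_new_measurement() returns trigger as $$
begin
  perform pg_notify('new_measurement',
                    new.device_id || ':' || new.process_id || ':' || new.timestamp::text);
  return new;
end;
$$ language plpgsql;

create trigger measurements_notify
after insert on measurements
for each row
execute function notify_new_measurement();   -- EXECUTE PROCEDURE on versions before 11

-- The worker simply runs:  LISTEN new_measurement;
-- and maintains the 1000-row groups in its own state.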