PostgreSQL Time Series Data – Resample for Hourly Averages

postgresql, python, timescaledb

I am new to SQL and am trying to learn how to write read queries on time series data. Can someone give me a tip on how to resample interval time series data to hourly averages in a Postgres read query?

My table is named building_data and has columns named time, metric, value, kv_tags, and m_tags.

time is my date/time stamp column, and I am trying to resample the data in the value column into hourly averages. The WHERE clause in the query below filters for the specific device whose data I am interested in. I apologize if that doesn't make sense.

As a first attempt, this SQL query appears to work, but it does not include any step to resample the data into hourly averages. Any tips are greatly appreciated.

SELECT
  "time" AS "time",
  metric AS metric,
  value,
  kv_tags,
  m_tags
FROM building_data
WHERE kv_tags->'equip_name' = '["35201"]' AND 
  m_tags IS NOT NULL
ORDER BY time desc limit 1000

Best Answer

You can use TimescaleDB's time_bucket function to do this.

Example:

SELECT
  time_bucket('1 hour', "time") AS hour,
  metric,
  avg(value)
FROM building_data
WHERE kv_tags->'equip_name' = '["35201"]' AND 
  m_tags IS NOT NULL
GROUP BY hour, metric
ORDER BY hour limit 1000

Note that I've simplified the query. If you want to include kv_tags and m_tags in the result, you'll need to either wrap them in aggregate functions or add them to the GROUP BY clause.
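Since the question is also tagged python, here is a minimal sketch of what the grouping above computes: each timestamp is rounded down to the start of its hour, and values are averaged per (hour, metric) pair. The function name hourly_averages and the sample readings are made up for illustration; this is not how TimescaleDB implements time_bucket, only a model of the result for plain hourly buckets.

```python
from collections import defaultdict
from datetime import datetime


def hourly_averages(rows):
    """Average value per (hour, metric).

    Models GROUP BY time_bucket('1 hour', "time"), metric for
    (timestamp, metric, value) tuples. Hypothetical helper, not a library API.
    """
    sums = defaultdict(lambda: [0.0, 0])  # (hour, metric) -> [running sum, count]
    for ts, metric, value in rows:
        # Round the timestamp down to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        acc = sums[(hour, metric)]
        acc[0] += value
        acc[1] += 1
    # Sort by (hour, metric) so the output resembles ORDER BY hour.
    return {key: total / count for key, (total, count) in sorted(sums.items())}


# Made-up sample readings, not taken from building_data.
rows = [
    (datetime(2021, 1, 1, 10, 5), "temp", 20.0),
    (datetime(2021, 1, 1, 10, 35), "temp", 22.0),
    (datetime(2021, 1, 1, 11, 10), "temp", 30.0),
]

for (hour, metric), avg in hourly_averages(rows).items():
    print(hour, metric, avg)
```

The two readings in the 10:00 hour collapse into one row with their average, and the single 11:10 reading becomes its own row. Doing the aggregation in the database, as in the answer, is usually preferable because far fewer rows have to leave the server.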

Related Solutions

PostgreSQL – SQL Hourly Data Aggregation

select
  date_trunc('hour', t - interval '1 minute') as interv_start,
  date_trunc('hour', t - interval '1 minute') + interval '1 hours' as interv_end,
  sum(v)
from myt
group by date_trunc('hour', t - interval '1 minute')
order by interv_start

see sqlfiddle

As for the index: you could try an index on the expression date_trunc('hour', t - interval '1 minute'), but I'm not sure PostgreSQL can use it.
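The one-minute shift in the query above is easy to misread, so here is a small Python sketch of which bucket a timestamp lands in. It assumes timestamps fall on whole minutes; bucket_start is a hypothetical helper that mirrors date_trunc('hour', t - interval '1 minute'), and the sample times are invented.

```python
from datetime import datetime, timedelta


def bucket_start(ts, shift=timedelta(minutes=1)):
    """Mimic date_trunc('hour', t - interval '1 minute') for a naive datetime."""
    shifted = ts - shift
    # Truncate the shifted timestamp to the start of its hour.
    return shifted.replace(minute=0, second=0, microsecond=0)


# Invented sample timestamps around an hour boundary.
samples = [
    datetime(2021, 1, 1, 10, 0),
    datetime(2021, 1, 1, 10, 1),
    datetime(2021, 1, 1, 11, 0),
]

for ts in samples:
    start = bucket_start(ts)
    print(ts, "->", start, "to", start + timedelta(hours=1))
```

Under that assumption, a reading stamped exactly on the hour is counted with the hour that just ended: 10:00 falls in the 09:00 to 10:00 bucket, while 10:01 and 11:00 both fall in the 10:00 to 11:00 bucket. Without the shift, 11:00 would instead open the next bucket.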

PostgreSQL – How to speed up an ASC sort on a column that only holds an integer between 0 and 9 across multiple millions of rows

I would recommend using the same syntax for all of your WHERE clauses (your index would build for negative values of bar, but you never SELECT those)

CREATE INDEX has an implicit order of ASC, so it is unlikely that omitting the index order is, on its own, the source of your problem (as you stated, the DESC sort is faster).

Naively, I'd also recommend that you include the id column in your index, but this may negatively affect insert/index rebuild performance, as well as overall disk and memory usage.

CREATE INDEX foo_bar_idx
  ON foo(bar ASC, id ASC) 
  WHERE bar > 0

You could try converting all zeros to NULL (and setting NULL as the default), then changing your WHERE clauses to WHERE bar IS NOT NULL. Your CREATE TABLE statement would then look like this:

CREATE TABLE foo (
  id TEXT NOT NULL,
  bar INTEGER NULL DEFAULT NULL,
  PRIMARY KEY (id)
);

Once you have NULLs established, you could alternatively play around with the NULLS FIRST / LAST parameters in your index, rather than using a WHERE clause, but that is unlikely to improve performance.
