Postgresql – Consolidate multiple rows into single row

aggregateaggregate-filterpivotpostgresql

In PostgreSQL 9.5, I have a table named reports:

CREATE TABLE public.reports (
  id BIGSERIAL PRIMARY KEY,
  id_station character(11) NOT NULL,
  date date NOT NULL,
  element character(4) NOT NULL,
  value smallint NOT NULL
);

For each station (id_station column) and each day (date column), I may have multiple value types (element column) : TMIN, TMAX, TAVG (sometimes these values don't exist: I may have only a TMIN and a TMAX for a given day).

Here is a (fake) sample:

22;"FR069029001";"1925-01-01";"TMAX";130
23;"FR069029001";"1925-01-01";"TMIN";-25
24;"FR069029001";"1925-01-01";"TAVG";0

I would like to consolidate these values in one single row for each station and each day using this table:

CREATE TABLE public.reports_con (
id SERIAL PRIMARY KEY,
id_station character(11) NOT NULL,
date date NOT NULL,
tmin smallint,
tmax smallint,
tavg smallint
);

I would like to achieve this result:

454;"FR069029001";"1925-01-01";-25;130;0

How to consolidate the data this way in PostgreSQL? With CREATE TABLE AS?

I know it has to be a recursive query like this (human language) :

For each day:
    For each station:
        Find values for TMIN, TMAX, TAVG
Insert the results in reports_con in a single row with day and station

I begin learning SQL and would like to achieve this within PostgreSQL (not with Python or a programming language). Can you please help me?

Best Answer

Since the target table obviously exists, CREATE TABLE AS is unrelated to the solution. And you certainly don't need a recursive query, either.

For just three columns you might use conditional aggregates:

INSERT INTO reports_con (id_station, date, tmin, tmax, tavg)
SELECT id_station, date
     , min(value) FILTER (WHERE element = 'TMIN') AS tmin
     , min(value) FILTER (WHERE element = 'TMAX') AS tmax
     , min(value) FILTER (WHERE element = 'TAVG') AS tavg
FROM   reports
GROUP  BY id_station, date
ORDER  BY id_station, date;

The aggregate FILTER clause requires Postgres 9.4. Details and alternatives for older versions:

For more columns of for lots of rows (and you need better performance), consider an actual crosstab() query. The special difficulty is that your key consists of two columns, but you need one for crosstab(). Generate a surrogate key with row_number() like in these related answers: