After some processing this boiled down to:
While your predicate d."SettlementPointName" = 'John'
is filtering a single value for "SettlementPointName" anyway, simplify to:
SELECT count( d."SettlementPointPrice" < 10.5 OR NULL) AS da_00_10
, count(d."SettlementPointPrice" >= 10.5 AND d."SettlementPointPrice" < 20.5 OR NULL) AS da_11_20
, count(d."SettlementPointPrice" >= 20.5 AND d."SettlementPointPrice" < 30.5 OR NULL) AS da_21_30
FROM public.da d
JOIN public.rt_aggregate r USING ("DeliveryDate", "SettlementPointName")
WHERE d."SettlementPointName" = 'John'
AND d."DeliveryDate" >= '2015-02-01'
AND d."DeliveryDate" <= '2015-02-20'
AND r."DeliveryHour" = 14
AND date_part('hour', d."DeliveryHour") = r."DeliveryHour";
About the counting technique:
Or better, yet, use the new aggregate filter technique in pg 9.4:
SELECT d."SettlementPointName"
, count(*) FILTER (WHERE d."SettlementPointPrice" < 10.5) AS da_00_10
, count(*) FILTER (WHERE d."SettlementPointPrice" >= 10.5
AND d."SettlementPointPrice" < 20.5) AS da_11_20
, count(*) FILTER (WHERE d."SettlementPointPrice" >= 20.5
AND d."SettlementPointPrice" < 30.5) AS da_21_30
FROM public.da d
JOIN public.rt_aggregate r USING ("DeliveryDate", "SettlementPointName")
WHERE d."DeliveryDate" >= '2015-02-01'
AND d."DeliveryDate" <= '2015-02-20'
AND r."DeliveryHour" = 14
AND date_part('hour', d."DeliveryHour") = r."DeliveryHour"
GROUP BY 1;
This time, selecting all names and returning one row per name like you asked in the comment.
Details for FILTER
:
Order of source rows irrelevant, source is array column
Your updated requirements are much simpler for two reasons:
- We don't need to establish any order among source rows.
- And the array already establishes order among array elements.
Simple table:
CREATE TABLE vals (j int[]);
Simple query:
SELECT ARRAY(SELECT elem FROM vals, unnest(j) elem) AS arr;
Order of source rows relevant, source are integer columns
Not as simple. Based on this table:
CREATE TABLE vals (i int, j int);
There are various ways to preserve order. I think this should be among the fastest - with an ARRAY constructor:
SELECT ARRAY(
SELECT t.elem
FROM (SELECT i, j FROM vals ORDER BY i, j) v
, LATERAL (VALUES (v.i), (v.j)) t(elem)
) AS arr;
My first draft had ORDER BY i, j
after the LATERAL
join, which would not guarantee the order of columns. I pulled the ORDER BY
into a subselect to produce the correct order now. This works, but it's not documented. To be absolutely sure, you have to add an explicit ORDER BY
later:
SELECT ARRAY (
SELECT elem
FROM vals v, LATERAL (VALUES (1, v.i), (2, v.j)) t(ordr, elem)
ORDER BY v.i, v.j, ordr
) AS arr;
About guaranteed order:
About the VALUES
expression in a LATERAL
join:
Another way with - this time with array_agg()
:
SELECT array_agg(elem) AS arr
FROM (
SELECT i, j, 1 AS ordr, i AS elem FROM vals
UNION ALL
SELECT i, j, 2 AS ordr, j AS elem FROM vals
ORDER BY i, j, ordr
) sub;
The ARRAY constructor is faster, array_agg()
is easier to integrate with multiple result columns.
SQL Fiddle.
The query in your test case would actually work in Postgres 9.5 (currently beta, to be released in the course of 2015). Quoting the release notes for pg 9.5:
Allow array_agg()
and ARRAY()
to take arrays as inputs (Ali Akbar, Tom
Lane)
However, you get a 2-dimenstional array instead, not your desired result. You can easily implement the same (but slower) aggregate function in older versions:
Best Answer
sum
was an obvious choice :