I've got a big table (~9M rows) and want to group the rows on a field containing the year. So far that's pretty easy:
-- greatly simplified:
SELECT count(*), year FROM dataset GROUP BY year ORDER BY 2;
We defined some irregular time periods spanning multiple years:
<1945, 1946-1964, 1965-1974, 1975-1991, 1992-2005 and >2005
I have no clue how to group these results in the GROUP BY clause. I could write a subquery for every time period:
SELECT
( SELECT count(*) FROM dataset WHERE year <= 1945 AND ...... ) AS pre1945,
( ....) AS period2,
....
FROM dataset
But that doesn't feel right, and I'm wondering whether it's possible to let PostgreSQL do it. Especially because this query is a strong simplification of the real one: it has multiple conditions, among them an ST_Within clause spanning four tables, so the subquery approach results in a bloated query.
Is there a better way to create this result?
Best Answer
Use conditional counting:
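The answer's original code block appears to have been lost; a minimal sketch of conditional counting, assuming the question's `dataset` table and `year` column, would look like this:

```sql
-- One pass over the table; each count() only sees the rows
-- whose CASE expression is non-null (CASE yields NULL otherwise).
SELECT count(CASE WHEN year <  1946 THEN 1 END)                AS pre1945,
       count(CASE WHEN year BETWEEN 1946 AND 1964 THEN 1 END)  AS y1946_1964,
       count(CASE WHEN year BETWEEN 1965 AND 1974 THEN 1 END)  AS y1965_1974,
       count(CASE WHEN year BETWEEN 1975 AND 1991 THEN 1 END)  AS y1975_1991,
       count(CASE WHEN year BETWEEN 1992 AND 2005 THEN 1 END)  AS y1992_2005,
       count(CASE WHEN year >  2005 THEN 1 END)                AS post2005
FROM   dataset;
```

Unlike the subquery approach, this scans the table once, and any extra conditions (joins, the ST_Within predicate, etc.) only need to be written once in the shared FROM/WHERE part.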
This works because count() ignores null values, and the case expression returns null for values outside the range it tests for (an else null is implicit). With the upcoming 9.4 version you can rewrite this using the new FILTER clause for aggregates.
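A sketch of the 9.4 FILTER form, assuming the same table and column names as above:

```sql
-- Each aggregate only counts the rows matching its FILTER predicate.
SELECT count(*) FILTER (WHERE year <  1946)                AS pre1945,
       count(*) FILTER (WHERE year BETWEEN 1946 AND 1964)  AS y1946_1964,
       count(*) FILTER (WHERE year BETWEEN 1965 AND 1974)  AS y1965_1974,
       count(*) FILTER (WHERE year BETWEEN 1975 AND 1991)  AS y1975_1991,
       count(*) FILTER (WHERE year BETWEEN 1992 AND 2005)  AS y1992_2005,
       count(*) FILTER (WHERE year >  2005)                AS post2005
FROM   dataset;
```

Semantically this is identical to the CASE version, just more readable; the FILTER clause is SQL-standard syntax for conditional aggregation.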