I have the following table:
-- id, location_id, status, posted_year, posted_quarter
CREATE TABLE foo AS
SELECT * FROM ( VALUES
(1 ,12,'active' ,2014,3),
(2 ,12,'inactive',2014,3),
(3 ,12,'active' ,2014,3),
(4 ,12,'active' ,2014,4),
(5 ,12,'inactive',2014,4),
(6 ,13,'active' ,2015,1),
(7 ,13,'active' ,2015,1),
(8 ,13,'inactive',2015,1),
(9 ,13,'active' ,2015,2),
(10,13,'active' ,2015,2),
(11,13,'inactive',2015,3),
(12,13,'active' ,2015,4),
(13,13,'active' ,2015,4),
(14,13,'inactive',2015,4),
(15,12,'active' ,2015,1),
(16,13,'active' ,2015,1),
(17,12,'inactive',2015,1),
(18,12,'active' ,2015,2)
) AS t(id,location_id,status,posted_year,posted_quarter);
I want to recreate this table but have only one quarter per year from each location.
We might have more than one record in a year for a quarter, and in this case we need to check the status based on:
If one status is active for a year in the selected quarter, the status will be active, otherwise the status will be inactive.
Examples:
- location_id 12 for year 2014 and quarter 3 will have one record in the new table with status active.
- location_id 12 for year 2015 and quarter 1 will have one record in the new table with status inactive.
How do I write this query?
Best Answer
The only trick here is to create something you can group by that involves both the year and the quarter. This isn't the only way to do it, but you can do this...
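As a sketch of one such key (not necessarily the exact expression this answer had in mind), make_date can fold the year and the quarter into a single date, here the first day of the quarter's last month. The alias quarter_date is a made-up name for illustration, and make_date assumes PostgreSQL 9.4 or later.

```sql
-- Build one sortable, groupable value per (year, quarter).
-- quarter_date is a hypothetical alias; quarter 1 -> March, 2 -> June, and so on.
SELECT id,
       location_id,
       status,
       make_date(posted_year, posted_quarter * 3, 1) AS quarter_date
FROM foo
ORDER BY location_id, quarter_date, id;
```

Grouping by location_id and quarter_date is then equivalent to grouping by location_id, posted_year, and posted_quarter.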
Remember, there are three months in a quarter-year. You can also do
posted_year + posted_quarter*0.25
but it comes to the same thing.
From there, it's pretty basic. I'll use a CTE to separate formatting from calculation (feel free to ditch it for speed).
- DISTINCT ON()
- GROUP BY
- ...bool_or()
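Putting those pieces together, a minimal sketch could look like the following. It rests on two assumptions that the question does not settle: that "one quarter per year" means the earliest quarter present for each location and year, and that any status other than 'active' counts as inactive. The names t and is_active are illustrative only.

```sql
-- Calculation: collapse duplicate rows per (location, year, quarter).
-- bool_or is true as soon as at least one row in the group is 'active'.
WITH t AS (
  SELECT location_id,
         posted_year,
         posted_quarter,
         bool_or(status = 'active') AS is_active
  FROM foo
  GROUP BY location_id, posted_year, posted_quarter
)
-- Formatting: keep one quarter per (location, year) and turn the
-- boolean back into the original status labels.
-- Assumption: the earliest quarter wins; use posted_quarter DESC for the latest.
SELECT DISTINCT ON (location_id, posted_year)
       location_id,
       posted_year,
       posted_quarter,
       CASE WHEN is_active THEN 'active' ELSE 'inactive' END AS status
FROM t
ORDER BY location_id, posted_year, posted_quarter;
```

With the sample data, location_id 12 in 2014 comes out as quarter 3 with status active. Location_id 12 in 2015 quarter 1 also comes out as active under the stated rule, because row 15 is active. Dropping the DISTINCT ON clause keeps every quarter instead of one per year.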
Or, alternatively (and maybe faster),
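one possible single-pass formulation, offered as a sketch rather than as this answer's exact query, skips the aggregate and lets DISTINCT ON pick the row directly. It makes the same assumptions as before: the earliest quarter per location and year is the one to keep, and status is never NULL. Whether it is actually faster depends on the data and the available indexes, so check with EXPLAIN ANALYZE.

```sql
-- For each (location, year), rows are ordered by quarter, and within a
-- quarter any 'active' row sorts first (true sorts before false under DESC).
-- DISTINCT ON then keeps only the first row of each (location, year) group.
SELECT DISTINCT ON (location_id, posted_year)
       location_id,
       posted_year,
       posted_quarter,
       status
FROM foo
ORDER BY location_id,
         posted_year,
         posted_quarter,
         (status = 'active') DESC;
```

The leading ORDER BY columns must match the DISTINCT ON list, which is why location_id and posted_year come first.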