First time using Postgres and building a query with a CASE
statement. I am trying to end up with a collection of data whereby one of the columns in the data (status
) qualifies the row based on conditions.
I have 3 tables in a Postgres DB as follows:
Table devices
:
id | serial_number | retired | last_reported_utc_at | other_data |
---|---|---|---|---|
2 | 1234 | FALSE | 2020-12-01 15:34:23 | … |
5 | 4567 | FALSE | 2020-12-01 15:34:23 | … |
6 | NULL | FALSE | NULL | … |
Table device_measurements
:
id | device_id | reading | read_utc_at |
---|---|---|---|
1 | 5 | 1.36 | 2020-12-01 15:34:23 |
2 | 2 | 2.45 | 2020-12-12 18:21:12 |
3 | 6 | 1.87 | 2020-12-03 20:29:12 |
Table devices_meters
:
id | device_id | meter_id | reading | activated_utc_at | deactivated_utc_at |
---|---|---|---|---|---|
1 | 5 | 23 | 1.36 | 2018-10-01 15:34:23 | NULL |
2 | 2 | 17 | 2.45 | 2019-06-12 18:21:12 | NULL |
3 | 6 | 17 | 2.45 | 2019-06-12 18:21:12 | NULL |
devices_meters
links the device to a meter location. The activated and deactivated dates will sequence the history of the device to meters.
The status
for each device can be one of:
active, online, offline, inactive, retired
I have been trying to generate a column status
with a CASE
statement to flag each row as one of the above. The query runs, but the resulting data is not as expected.
My CASE
statement is:
CASE
// Device retired boolean column flag set to true
WHEN devices.retired = true
THEN 'retired'
// Device with null serial number
WHEN devices.retired = false
AND devices.serial_number IS NULL
THEN 'inactive'
// Device has not reported back last 24 hrs
WHEN
devices.retired = false
AND MAX(device_measurements.reported_utc_at) < :yesterdayFilter
THEN 'offline'
// Device has reported back last 24hrs and has non zero readings for data in this period
WHEN
devices.retired = false
AND MAX(device_measurements.reported_utc_at) BETWEEN :yesterdayFilter AND :nowFilter
AND SUM(device_measurements.flow) > 0
THEN 'active'
// Device has reported back last 24hrs and has all zero readings for data in this period
WHEN
devices.retired = false
AND MAX(device_measurements.reported_utc_at) BETWEEN :yesterdayFilter AND :nowFilter
AND SUM(device_measurements.flow) = 0
THEN 'online'
END AS status,
:nowFilter
contains a date of now()
in UTC.
:yesterdayFilter
contains a date of now()->subHours(24)
.
The above does not fail but returns unpredictable results, such an offline device which has had a data report less than 24 hrs ago.
The whole query is:
select
"devices".*,
CASE
WHEN devices.retired = true THEN 'retired'
WHEN devices.retired = false
AND devices.serial_number IS NULL THEN 'inactive'
WHEN devices.retired = false
AND MAX(device_measurements.reported_utc_at) BETWEEN :yesterdayFilter AND :nowFilter
AND SUM(device_measurements.flow) > 0 THEN 'active'
WHEN devices.retired = false
AND MAX(device_measurements.reported_utc_at) BETWEEN :yesterdayFilter AND :nowFilter
AND SUM(device_measurements.flow) = 0 THEN 'online'
WHEN devices.retired = false
AND MAX(device_measurements.reported_utc_at) < :yesterdayFilter THEN 'offline'
END AS status,
MAX(device_measurements.reported_utc_at) AS last_reported_utc_at
from
"devices"
left join "devices_meters" on "devices_meters"."device_id" = "devices"."id"
left join "device_measurements" on "device_measurements"."device_id" = "devices"."id"
group by
"devices"."id"
having
(
CASE
WHEN devices.retired = true THEN 'retired'
WHEN devices.retired = false
AND devices.serial_number IS NULL THEN 'inactive'
WHEN devices.retired = false
AND MAX(device_measurements.reported_utc_at) BETWEEN :yesterdayFilter AND :nowFilter
AND SUM(device_measurements.flow) > 0 THEN 'active'
WHEN devices.retired = false
AND MAX(device_measurements.reported_utc_at) BETWEEN :yesterdayFilter AND :nowFilter
AND SUM(device_measurements.flow) = 0 THEN 'online'
WHEN devices.retired = false
AND MAX(device_measurements.reported_utc_at) < :yesterdayFilter THEN 'offline'
END
) = :status
order by
"devices"."installed_at" desc
limit
10 offset 0;
Best Answer
Based on a couple of educated guesses, this might be the query you actually want:
Major points
CASE
statements are processed from left to right. Assumingretired
isboolean NOT NULL
, repeatingdevices.retired = false
in every followingWHEN
clause is just noise. Consider the manual:You write:
So it's not a variable but a constant
now() AT TIME ZONE 'UTC'
.Consider operating with timestamp with timezone instead. See:
So a timestamp that is less than 24h earlier than
:nowFilter
?I removed the join to
devices_meters
completely, as it plays no role in the query.I moved the aggregation into subquery. That's cleaner and cheaper for the purpose. Consider:
You seem to want the sum of
flow
for the given time interval, not the sum total of all rows. I achieve that with an aggregateFILTER
clause. See:But you seem to want the latest
reported_utc_at
overall, so no filter there.I wrapped it all in a subquery and added an outer
WHERE
clause to replace the much more verboseHAVING
clause.ORDER BY
,LIMIT
, andOFFSET
after that, to get 10 rows forLIMIT 10
.