I have a table including these columns:
Sample data:
concurrent_users  concurrent_timestamp     concurrent_date
3                 2020-06-24 07:20:00 UTC  2020-06-24
7                 2020-06-24 08:20:00 UTC  2020-06-24
8                 2020-07-05 06:20:00 UTC  2020-07-05
2                 2020-07-05 03:20:00 UTC  2020-07-05
I want to list the date and timestamp where concurrent_users is highest for each date.
Expected output:
concurrent_users  concurrent_timestamp     concurrent_date
7                 2020-06-24 08:20:00 UTC  2020-06-24
8                 2020-07-05 06:20:00 UTC  2020-07-05
I tried the following query:
Error 1
select concurrent_timestamp, max(concurrent_users)
from sample_data
group by concurrent_date;
But this gives an error because the concurrent_timestamp column is not included in the group by clause.
When I put concurrent_timestamp in group by, I get …
Error 2
select concurrent_timestamp, max(concurrent_users)
from sample_data
group by concurrent_timestamp ;
It does not give the desired output; instead, it lists more than one record for each date.
How to solve this?
Best Answer
In PostgreSQL you can simply use DISTINCT ON.
We don't need the redundant column concurrent_date for this at all. Casting concurrent_timestamp to date on the fly is very cheap, overall cheaper than storing the functionally dependent value redundantly.
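A minimal sketch of a DISTINCT ON query along these lines, assuming the table is named sample_data as in the question. The final tie-breaker in ORDER BY (earliest timestamp wins when two rows share the same maximum) is an assumption, not something the question specifies:

```sql
-- Keep one row per calendar day: the row with the highest concurrent_users.
-- DISTINCT ON keeps the first row of each group, so ORDER BY decides which row wins.
SELECT DISTINCT ON (concurrent_timestamp::date)
       concurrent_users
     , concurrent_timestamp
     , concurrent_timestamp::date AS concurrent_date  -- derived on the fly, not read from the stored column
FROM   sample_data
ORDER  BY concurrent_timestamp::date      -- must match the DISTINCT ON expression
        , concurrent_users DESC           -- highest count first within each day
        , concurrent_timestamp;           -- assumed tie-breaker: earliest timestamp
```

If concurrent_timestamp is of type timestamptz, the cast to date depends on the session's TimeZone setting, so the day boundaries follow that time zone. If concurrent_users can be NULL, append NULLS LAST to the DESC sort key, because PostgreSQL sorts NULLs first in descending order.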
DISTINCT ON is typically fastest for few rows per group. Depending on table definition and data distribution, there may be (much) faster solutions.