MySQL – Find timestamp for highest number of concurrent users by date

greatest-n-per-group, mysql, postgresql

I have a table, sample_data, with the following columns and sample data:

concurrent_users  concurrent_timestamp       concurrent_date
3                 2020-06-24 07:20:00 UTC    2020-06-24
7                 2020-06-24 08:20:00 UTC    2020-06-24
8                 2020-07-05 06:20:00 UTC    2020-07-05
2                 2020-07-05 03:20:00 UTC    2020-07-05

I want to find/list the date and timestamp where the concurrent_users is highest for each date.

Expected output:

concurrent_users  concurrent_timestamp       concurrent_date
7                 2020-06-24 08:20:00 UTC    2020-06-24
8                 2020-07-05 06:20:00 UTC    2020-07-05

I tried the following queries:

Error 1

select concurrent_timestamp, max(concurrent_users)
from sample_data
group by concurrent_date;

But this gives an error because the concurrent_timestamp column is neither aggregated nor included in the GROUP BY clause.

When I put concurrent_timestamp in group by, I get …

Error 2

select concurrent_timestamp, max(concurrent_users)
from sample_data
group by concurrent_timestamp;

This does not give the desired output; instead, it lists more than one record for each date.

How can I solve this?

Best Answer

In PostgreSQL you can simply use DISTINCT ON:

SELECT DISTINCT ON (concurrent_date) *
FROM   sample_data
ORDER  BY concurrent_date, concurrent_users DESC;

We don't need the redundant column concurrent_date for this at all. Casting concurrent_timestamp to date on the fly is very cheap, and overall cheaper than storing the functionally dependent value redundantly:

SELECT DISTINCT ON (concurrent_timestamp::date) *
FROM   sample_data
ORDER  BY concurrent_timestamp::date, concurrent_users DESC;
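
MySQL has no DISTINCT ON, but a window function can pick the top row per date in much the same way. A minimal sketch, assuming MySQL 8.0 or later (where window functions are available) and the sample_data table from the question; the alias names ranked and rn are made up for illustration:

```sql
-- Assumes MySQL 8.0+ (window functions); the same statement also runs in PostgreSQL.
-- "ranked" and "rn" are illustrative names, not part of the original schema.
SELECT concurrent_users, concurrent_timestamp, concurrent_date
FROM  (
   SELECT s.*,
          -- Number the rows of each date, highest concurrent_users first.
          ROW_NUMBER() OVER (PARTITION BY concurrent_date
                             ORDER BY concurrent_users DESC) AS rn
   FROM   sample_data AS s
   ) AS ranked
WHERE  rn = 1               -- keep only the top row per date
ORDER  BY concurrent_date;
```

Like the DISTINCT ON queries, this returns exactly one row per date. If two rows on the same date tie on concurrent_users, which one is kept is arbitrary unless another column (for example concurrent_timestamp) is appended to the ORDER BY inside OVER.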

DISTINCT ON is typically fastest for few rows per group. Depending on table definition and data distribution, there may be (much) faster solutions.
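
On MySQL versions older than 8.0, which lack window functions, one common fallback is to join the table to its per-date maximum. This is only a sketch under the same assumptions (table sample_data from the question; the aliases s and m and the column name max_users are illustrative), and it is not claimed to be one of the faster solutions mentioned above:

```sql
-- Works without window functions or DISTINCT ON (plain GROUP BY plus a join).
-- "s", "m" and "max_users" are illustrative names.
SELECT s.concurrent_users, s.concurrent_timestamp, s.concurrent_date
FROM   sample_data AS s
JOIN  (
   -- One row per date carrying that date's highest concurrent_users.
   SELECT concurrent_date, MAX(concurrent_users) AS max_users
   FROM   sample_data
   GROUP  BY concurrent_date
   ) AS m ON  m.concurrent_date = s.concurrent_date
          AND m.max_users       = s.concurrent_users
ORDER  BY s.concurrent_date;
```

Unlike DISTINCT ON, this returns every tied row when several rows on the same date share the maximum concurrent_users.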