PostgreSQL – How to Select the First or Last Entry for a Specific Day

greatest-n-per-grouppostgresql-9.3

I have a table "AttendanceRecords" with entries like this

EntryID UUID PRIMARY KEY,
StafID UUID REFERENCES staffmembers(ID) NOT NULL,
ArrivalTime TIMESTAMP NOT NULL,
DepartureTime TIMESTAMP

For one function I want to find the first and last entry for a specified staff member on a specified date. To Clarify, First would be as per

SELECT MIN(ArrivalTime) 
FROM AttendanceREcords 
WHERE StaffID = 'xxxxx';

But I'm looking for how to compose a query that will do

SELECT EntryID 
FROM AttendanceREcords 
WHERE StaffID = 'xxxxx' 
AND ArrivalTime = MIN(ArrivalTime);

On a related report I would like to find all the entries where the arrival-times are on a specified date, eg sorted by StaffID

What would the queries look like for these (Using Postgres 9.3)

Best Answer

You can't use an aggregate function in a condition, you need to use a sub-select

SELECT EntryID 
FROM AttendanceREcords 
WHERE StaffID = 'xxxxx' 
AND ArrivalTime = (select min(ArrivalTime) 
                   from AttendanceREcords
                   where StaffID = 'xxxxx');

But this can be done more efficiently by using a LIMIT clause:

SELECT EntryID 
FROM AttendanceREcords 
WHERE StaffID = 'xxxxx' 
ORDER BY arrivalTime 
LIMIT 1

There is a difference between the two statements: if more than one row has the same minimum arrival time, the first one will return all of them, the second one only one row.

Another alternative that is usually more efficient that a sub-select is using a window function:

SELECT EntryID
FROM (
  SELECT EntryID, 
         dense_rank() over (order by arrivalTime) as rnk
  FROM AttendanceREcords 
  WHERE StaffID = 'xxxxx' 
) t
where rnk = 1;

By changing the order by arrivalTime you can select the first or the last.

If you want the first and last in a single query you can do something like this:

SELECT EntryID,
       arrivalTime
FROM (
  SELECT EntryID, 
         min(arrivalTime) as min_time,
         min(arrivalTime) as max_time,
  FROM AttendanceREcords 
  WHERE StaffID = 'xxxxx' 
) t
where arrival_time = min_time
   or arrival_time = max_time;