Postgresql – How to group rows while filtering out values per each group

greatest-n-per-groupgroup bypostgresql

Imagine you have the following setup:

CREATE TABLE namechanges (
  id text,
  new_name text,
  change_date timestamp
);

INSERT INTO namechanges VALUES
  ('x', 'alice', '2020-03-01'),
  ('y', 'Bob T.', '2020-03-03'),
  ('x', 'Alice', '2020-03-05'),
  ('z', 'charlie', '2020-03-07'),
  ('x', 'Alice C.', '2020-03-09'),
  ('z', 'Charlie Z.', '2020-03-11')

How would look like a query that would retrieve just the current name for each id and return the following?

| id  | max        |
| --- | ---------- |
| z   | Charlie Z. |
| y   | Bob T.     |
| x   | Alice C.   |

Here's the example above on DB Fiddle if you want to play with it: https://www.db-fiddle.com/f/ugNcXhRyb44KTpqjFXmKDW/0

Best Answer

A much better way of doing what you want to do is to use the ROW_NUMBER() Analytic function as follows (see fiddle here). It avoids the "workaround" of concatenating strings and these functions are great - see below.

SELECT rn, id, new_name, change_date 
FROM  
(
  SELECT 
    ROW_NUMBER() OVER (PARTITION BY id  ORDER BY id, change_date DESC) as rn,
    id, new_name, change_date
  FROM namechanges
  ORDER  BY id, change_date DESC
) AS tab
WHERE rn = 1
ORDER BY id, change_date;

Result:

rn  id  new_name    change_date
1   x   Alice C.    2020-03-09 00:00:00
1   y   Bob T.      2020-03-03 00:00:00
1   z   Charlie Z.  2020-03-11 00:00:00

Analytic (aka Window) functions are very powerful and will reward, a gazillion fold, time spent learning them! I've left the steps in the logic that I followed in the fiddle, so that (plus the tutorial link above) should give you a good start.