I'm working with PostgreSQL 9.4.
I have a table that contains the following entries:
id | postcode | date_created
---+----------+-----------------
14 | al2 2qp | 2015-09-23 14:46:57
14 | al2 2qp | 2015-09-23 14:51:07
14 | sp2 8ag | 2015-09-23 14:56:11
14 | se4 | 2015-09-23 16:12:05
17 | e2 | 2015-09-23 16:15:35
17 | fk20 8ru | 2015-09-23 16:28:35
17 | fk20 8ru | 2015-09-23 16:35:51
17 | se2 | 2015-09-23 16:36:17
17 | fk20 8ru | 2015-09-23 16:36:22
17 | fk20 8ru | 2015-09-23 16:37:04
17 | se1 | 2015-09-23 16:37:11
17 | fk20 8ru | 2015-09-23 16:37:15
17 | se1 8ga | 2015-09-24 09:52:46
17 | se1 | 2015-09-24 10:01:19
17 | hp27 9rz | 2015-09-24 10:05:27
17 | hp27 9rz | 2015-09-24 10:05:29
17 | se1 | 2015-09-24 10:19:46
14 | tn21 8qb | 2015-09-24 14:49:05
14 | tn21 8qb | 2015-09-24 15:42:45
14 | tn21 8qb | 2015-09-24 17:38:06
14 | n4 1ny | 2015-09-25 14:49:10
What I want to achieve is a query that returns the 5 most recent unique postcode records for each id:
id | postcode
---+---------
14 | n4 1ny
14 | tn21 8qb
14 | se4
14 | sp2 8ag
14 | al2 2qp
17 | se1
17 | hp27 9rz
17 | se1 8ga
17 | fk20 8ru
17 | se2
What would be the best way of achieving this? I've been playing around with subqueries but keep hitting walls when it comes to ordering them whilst doing a DISTINCT
and GROUP BY
.
Best Answer
There are probably many ways to do this. The first that comes to mind is to use window functions:
Test at SQLfiddle.
If there are ties, say the 5th, 6th and 7th
postcode
for anid
have the samedate_created
, only one of them (choice will be arbitrary) will be in the results. If you want all the tied postcodes in those cases, useRANK()
instead ofROW_NUMBER()
.Another option is to use the
LATERAL
syntax. I'm not sure which will be more efficient, it wil probably depend on the values distribution of the two columns (id
andpostcode
), i.e. how many distinct ids in the whole table, how many distinct postcodes per id and how many rows per (id, postcode) combinations.Adding an index on
(id, postcode, date_created)
would be a good idea, too - or on(id, postcode, date_created DESC)
.