PostgreSQL – Select Arbitrary Element

postgresqlselect

I have a table that has path1, path2, and sha1 value. For any values of path2 and sha1, there can be multiple values of path1. I just want one of those paths. I don't really care which one.

I'm thinking I can do a group by for path2 and sha1. Now I just need to select one of the values of path1. I suppose I could select the minimum value of path1 but that would be doing extra work that isn't really needed.

Google tells me that Microsoft has "FIRST" but I don't see that in the postgres pages. Plus… I'd like to stick with normal SQL if possible.

Best Answer

There are a bunch of ways you can do this, one of them is with DISTINCT ON as @Ypercube has suggested,

SELECT DISTINCT ON (path2, sha1) path2, sha1, path1
FROM table_name
ORDER BY path2, sha1;

You can also use an ordered-set aggregate which should generally be slower.

SELECT percentile_disc(0) WITHIN GROUP (ORDER BY path1) AS path1, path2, sha1
FROM table_name
GROUP BY path2, sha1;