Suppose I have two tables:
Foo:
id
baz
Bar:
id
foo_id
boom
So a Foo has many Bars. I frequently find myself in situations where I need to compute an aggregate across the Bars for a given set of Foos but I also want some properties from Foo. The two most straightforward ways of doing this are ugly:
Method #1: Unnecessary Aggregate Functions
select
foo.id,
min(foo.baz) as baz,
min(bar.boom) as min_boom
from
foo
join
bar on foo.id = bar.foo_id
group by
foo.id;
Method #2: Unnecessary Group By Column
select
foo.id,
foo.baz,
min(bar.boom) as min_boom
from
foo
join
bar on foo.id = bar.foo_id
group by
foo.id,
foo.baz;
This is not so bad when there's only one additional column from Foo besides "id", but if there are many columns that need to be included then the grouping becomes much less efficient. A query like this gets around both these issues, but seems unwieldy:
select
foo.id,
foo.baz,
x.min_boom
from
foo
join
(select
foo_id,
min(boom) as min_boom
from
bar
group by
foo_id) x on x.foo_id = foo.id;
Is there a better way? Platform is Postgres if that matters.
Best Answer
If id is defined as the primary key, you can omit grouping by all the foo columns you want for the output as long as you are grouping by the id. This special case of grouping is in accordance with the current SQL standard and has also been covered in the PostgreSQL manual, starting from version 9.1:
(Emphasis added.)
So, if foo.id is the PK, this query would be valid: