PostgreSQL Group By – Why Select All Fields When Grouping by Primary Key but Not Another Column

Tags: group-by, postgresql, select

How is this a valid statement (where id is the primary key of the table):

select * from table group by id ;

and this is not:

select * from table group by name ;

ERROR: column "pgluser.id" must appear in the GROUP BY clause or be used in an aggregate function

The question is: why is the first query legal, i.e. why is grouping by the primary key valid?

Best Answer

id is the primary key.
As far as I remember, this is actually a legal query according to ANSI/ISO SQL.
Grouping by the primary key puts exactly one row in each group, which is logically the same as not grouping at all (or grouping by all columns), so every other column can be selected.

create table t (id int primary key,c1 int,c2 int);
insert into t (id,c1,c2) values (1,2,3),(4,5,6);
select * from t group by id;

+----+----+----+
| id | c1 | c2 |
+----+----+----+
| 1  | 2  | 3  |
+----+----+----+
| 4  | 5  | 6  |
+----+----+----+

Reference given by @a_horse_with_no_name

https://www.postgresql.org/docs/current/static/sql-select.html#SQL-GROUPBY

When GROUP BY is present, or any aggregate functions are present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or when the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column.
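A short sketch of what the quoted rule permits, assuming PostgreSQL 9.1 or later and the table t from the first example (id declared as the primary key). The table t_child and its columns are hypothetical and exist only for illustration.

```sql
-- Assumes t (id int primary key, c1 int, c2 int) from the first example.
-- t_child is a hypothetical table, introduced only for this sketch.
create table t_child (
    child_id int primary key,
    t_id     int references t (id)
);

-- The grouped columns only need to contain the primary key:
-- {id, c1} has the subset {id}, so c2 may still be selected.
select * from t group by id, c1;

-- A common practical use: count child rows per parent
-- without listing every parent column in GROUP BY.
select t.*, count(tc.child_id) as children
from t
left join t_child tc on tc.t_id = t.id
group by t.id;
```

In the second query every column of t is functionally dependent on t.id, so t.* is accepted even though only t.id is grouped.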

While we might logically expect UNIQUE NOT NULL to behave the same way, PostgreSQL applies this rule only to primary keys (as described in the documentation):

create table t (id int unique not null,c1 int,c2 int);
insert into t (id,c1,c2) values (1,2,3),(4,5,6);
select * from t group by id;

[Code: 0, SQL State: 42803] ERROR: column "t.c1" must appear in the GROUP BY clause or be used in an aggregate function
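When the grouped column is only UNIQUE NOT NULL (or not unique at all, like name in the question), the query has to be rewritten. The following is a sketch of three common options against the same table t, not the only way to do it.

```sql
-- Option 1: list every selected column in GROUP BY.
select id, c1, c2
from t
group by id, c1, c2;

-- Option 2: wrap the ungrouped columns in aggregate functions.
-- If id were not unique, min(c1) and min(c2) could come from different rows.
select id, min(c1) as c1, min(c2) as c2
from t
group by id;

-- Option 3 (PostgreSQL-specific): keep one whole row per id with DISTINCT ON.
-- The ORDER BY decides which row of each group is kept.
select distinct on (id) *
from t
order by id, c1;
```

Option 3 is usually the closest match when the intent is "one complete row per group" rather than a true aggregation.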

Related Solutions

Column ‘Comments.Text’ is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause

Your question isn't really clear, but I think you want something like this:

select group_id, 
       item_id, 
       comment
from (
   select group_id,
          item_id, 
          comment, 
          row_number() over (partition by group_id order by item_id) as rn
   from the_unknown_table
) t
where rn = 1;

(you didn't state your DBMS, so this is ANSI SQL)

SQL Fiddle demo: http://sqlfiddle.com/#!12/a8471/1

MySQL – Select Records in One Table but Not in Another Without Matching Primary Key

MySQL, and in fact any SQL product, will not do what you expect with

c.id=NULL

A comparison with NULL using = is never true (it evaluates to NULL), which causes confusion. You really mean

c.id IS NULL

You need to change it to

SELECT count(*) FROM list l LEFT OUTER JOIN cardinal c ON l.sku=c.sku where c.id is null;

Give it a try!

ABOUT YOUR QUESTION

Your first query

SELECT count(*) FROM list l LEFT JOIN cardinal c ON c.id=null;

behaves like a LEFT JOIN in which nothing matches on the cardinal table: the condition c.id=null is never true, so every row of list is returned exactly once, which is why the count 2677513 is correct.

Your second query

SELECT count(*) FROM list l LEFT OUTER JOIN cardinal c ON l.sku=c.sku where c.id=null;

is a little more explicit, but in the WHERE clause c.id=null evaluates to NULL for every row, so every row is filtered out and the count is 0.
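The NULL behaviour described above can be checked directly, and the same "rows in list with no match in cardinal" count can also be written with NOT EXISTS. This is a sketch that assumes c.id is never NULL in cardinal (for example because it is the primary key), so that it gives the same result as the LEFT JOIN ... IS NULL form.

```sql
-- Comparing with NULL using = yields NULL (unknown), never true.
select null = null  as equals_result,   -- NULL
       null is null as is_null_result;  -- true (displayed as 1 in MySQL)

-- Anti-join written with NOT EXISTS: counts the rows of list
-- whose sku has no matching row in cardinal.
select count(*)
from list l
where not exists (
    select 1
    from cardinal c
    where c.sku = l.sku
);
```

The NOT EXISTS form avoids the = NULL pitfall entirely because no NULL test appears in the query.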
