PostgreSQL – Column Must Appear in GROUP BY Clause or Be Used in Aggregate Function

Tags: postgresql, postgresql-10, postgresql-9.3

I have a simple table with columns col1, col2, col3. All not nullable.

I want to delete all rows where the tuple (col1, col2) has several entries. Background: a unique constraint for (col1, col2) should be added.

drop table if exists mytable;

create table mytable (
    col1 integer not null,
    col2 integer not null,
    col3 integer not null);

-- rows to delete
insert into mytable values (1, 1, 1);
insert into mytable values (1, 1, 2);

-- rows to keep
insert into mytable values (2, 2, 1);
insert into mytable values (2, 3, 2);



delete from mytable where 
(col1, col2) in  (
    select col1, col2 from mytable  
    group by (col1, col2) having  count(distinct col3) >1) ;

select * from mytable;

The above works on PostgreSQL 10 but fails on older versions.

Older versions report this error:

ERROR: column "mytable.col1" must appear in the GROUP BY clause or be used in an aggregate function

How can I get this working on PostgreSQL 9.3?

Best Answer

You just need to remove the parentheses around the columns in group by (col1, col2). This also works in version 9.4 and earlier:

delete from mytable  
where (col1, col2) in  (
    select col1, col2 from mytable  
    group by col1, col2                   -- <-- changed
    having  count(distinct col3) >1) ;

The reason it fails (I think) is that although (col1, col2) is equivalent to row(col1, col2), the various clauses handled it inconsistently, and this was fixed in 9.5. In earlier versions, you could use a more complex construction in WHERE: WHERE (SELECT (col1, col2)) IN .... So this should work in 9.3 as well:

delete from mytable
where (select (col1, col2)) in  (
    select (col1, col2) from mytable  
    group by (col1, col2) having  count(distinct col3) >1) ;
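
Before deleting anything, it can help to preview which (col1, col2) pairs the subquery selects, and, once the duplicates are gone, to add the unique constraint mentioned in the question. The following is a minimal sketch against the sample table above; the constraint name mytable_col1_col2_key is an assumption chosen for illustration. Note that the delete only removes pairs with more than one distinct col3, so pairs duplicated with identical col3 values would survive; if that can occur in your data, having count(*) > 1 may be closer to what the constraint needs.

```sql
-- Preview the pairs that the delete would remove.
select col1, col2, count(*) as n_rows, count(distinct col3) as n_col3
from mytable
group by col1, col2
having count(distinct col3) > 1;

-- After the delete, check that no duplicate pairs remain (should return no rows).
select col1, col2
from mytable
group by col1, col2
having count(*) > 1;

-- Hypothetical constraint name; this statement fails if duplicate pairs remain.
alter table mytable
    add constraint mytable_col1_col2_key unique (col1, col2);
```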

Related Solutions

PostgreSQL – pg_restore: [custom archiver] could not read from input file: end of file

I would suggest trying something along the lines of:

SELECT count(*) FROM my_table WHERE my_primary_key < value ;

... and finding the point at which it fails. This may be tedious, but you can bisect: halve the key range repeatedly until you find the first value for which the query fails.

You will most probably need to change the settings that affect index usage, because you want to force the database to use the index even when it has to read 99% of the table. The goal is to keep it from scanning the damaged page.
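
One way to push the planner toward the index is to turn off the competing scan types for the current session. This is a rough sketch, assuming the table is called my_table, that my_primary_key is indexed, and using 500000 as a placeholder probe value. enable_seqscan and enable_bitmapscan are standard planner settings, but whether they are enough in your case is something to confirm with EXPLAIN.

```sql
-- Discourage sequential and bitmap scans for this session only.
SET enable_seqscan = off;
SET enable_bitmapscan = off;

-- Confirm the plan uses the primary key index before running the probe.
EXPLAIN SELECT count(*) FROM my_table WHERE my_primary_key < 500000;

-- Probe (500000 is a placeholder): if this succeeds, retry with a larger
-- value; if it fails, retry with a smaller one.
SELECT count(*) FROM my_table WHERE my_primary_key < 500000;

-- Restore the defaults afterwards.
RESET enable_seqscan;
RESET enable_bitmapscan;
```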

If you can get SELECT to return most of the data, you can then do something like:

CREATE TABLE my_table_2 AS 
SELECT * FROM my_table WHERE my_primary_key < value;

and later on:

ALTER TABLE my_table RENAME TO my_table_old ;
ALTER TABLE my_table_2 RENAME TO my_table ;

... and you'll have all the data that could be retrieved. I wouldn't drop the old table, in case someone later finds a better way to retrieve the missing data.

Best of luck.

PostgreSQL – Fixing ‘Must Appear in the GROUP BY Clause or Be Used in an Aggregate Function’ Error

Have a look at the PostgreSQL docs on GROUP BY:

When GROUP BY is present, or any aggregate functions are present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or when the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column.

According to this, you cannot reference a column in the SELECT list unless it appears in the GROUP BY clause, is used inside an aggregate function, or is functionally dependent on the grouped columns.

In your example, you're referencing p_search.id, but it doesn't appear in the GROUP BY clause. You could try replacing it with kia.image_id.
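
To make the rule concrete, here is a small sketch using a hypothetical orders table (not the tables from the question). It shows the failing form as a comment, followed by three ways to satisfy the rule quoted above: aggregating the column, grouping by it, or grouping by the primary key so that the other columns are functionally dependent.

```sql
-- Hypothetical table, for illustration only.
create table orders (
    id          integer primary key,
    customer_id integer not null,
    amount      integer not null);

-- Fails: amount is neither grouped nor aggregated.
-- select customer_id, amount from orders group by customer_id;

-- Works: the ungrouped column is wrapped in an aggregate.
select customer_id, sum(amount) as total
from orders
group by customer_id;

-- Works: every selected column is listed in GROUP BY.
select customer_id, amount
from orders
group by customer_id, amount;

-- Works (PostgreSQL 9.1 and later): id is the primary key, so the other
-- columns are functionally dependent on it.
select id, customer_id, amount
from orders
group by id;
```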
