Postgresql – Functional dependency not detected for some columns

group bypostgresql

Here's a simple example schema:

ERD

CREATE TABLE parents (
  parent_id INT,
  type text null,
  PRIMARY KEY (parent_id)
);

CREATE TABLE children (
  child_id INT,
  parent_id INT,
  name text,
  PRIMARY KEY (child_id)
);

INSERT INTO parents (parent_id, type) VALUES (1, null);
INSERT INTO children (child_id, parent_id, name) VALUES (1, 1, 'foo');

This query works:

SELECT child_id
FROM children
JOIN parents USING (parent_id)
GROUP BY child_id, parent_id

Fiddle

If I change that to SELECT child_id, name it still works, as it should because name is functionally dependent on child_id and child_id is in the GROUP BY clause.

Now if I change it to SELECT child_id, type I think it should still work, because type is functionally dependent on parent_id and parent_id is in the GROUP BY clause. But instead I get an error message:

column "parents.type" must appear in the GROUP BY clause or be used in an aggregate function

What's the difference? Does it have anything to do with the JOIN?

Best Answer

I found out what is the difference.

When I write

SELECT child_id, type
FROM children
JOIN parents USING (parent_id)
GROUP BY child_id, parent_id

Postgres seems to assume I mean GROUP BY child_id, children.parent_id and even though those two are guaranteed to be identical by the JOIN, they don't seem to count as functionally dependent. If instead I write

SELECT child_id, type
FROM children
JOIN parents USING (parent_id)
GROUP BY child_id, parents.parent_id

Postgres recognizes the functional dependency and allows type to be in the SELECT expression.