PostgreSQL Array Conversion – Convert Right Side of Many-to-Many Join into Array

aggregatearraypostgresqlpostgresql-9.4

When using join on many to many relationship the result is split on multiple rows. What I'd like to do is convert the right side of a join into an array so the result is one row.

Example with 3 tables:

CREATE TABLE items (
  id    serial primary key,
  title text
);

CREATE TABLE tags (
  id    serial primary key,
  title text
);

CREATE TABLE items_tags (
  item_id int references items(id),
  tag_id  int references tags(id),
  primary key (item_id, tag_id)
);

When selecting items with their tags I can do it this way:

SELECT i.id, i.title, i.title
FROM items i
INNER JOIN items_tags it
ON it.item_id = i.id
INNER JOIN tags t
ON t.id = it.tag_id;

And the result will come up as:

(1, "item n1", "sport")
(1, "item n1", "soccer")
(2, "item n2", "adventure")
(2, "item n2", "mountain climbing")
(2, "item n2", "sport")
(2, "item n2", "nature")

What I'd like to have is this:

(1, "item n1", ["sport", "soccer"])
(2, "item n2", ["adventure", "mountain climbing", "sport" , "nature"])

Best Answer

To aggregate most rows

While querying all or most items, it is typically substantially faster to aggregate rows from the "many"-table first and join later:

SELECT id, i.title AS item_title, t.tag_array
FROM   items      i
JOIN  (  -- or LEFT JOIN ?
   SELECT it.item_id AS id, array_agg(t.title) AS tag_array
   FROM   items_tags it
   JOIN   tags       t  ON t.id = it.tag_id
   GROUP  BY it.item_id
   ) t USING (id);

Use LEFT [OUTER] JOIN in the outer query if there can be items without tags - which would be excluded with [INNER] JOIN.

Since that does not multiply rows in the join, we need no GROUP BY in the outer SELECT.

Joining before aggregation also gets out of hands with more than one 1:n table in the FROM list (not in this simple case). See:

To aggregate few rows

For a small percentage of rows, use a LATERAL join with an ARRAY constructor:

SELECT id, title AS item_title, t.tag_array
FROM   items i, LATERAL (  -- this is an implicit CROSS JOIN
   SELECT ARRAY (
      SELECT t.title
      FROM   items_tags it
      JOIN   tags       t  ON t.id = it.tag_id
      WHERE  it.item_id = i.id
      ) AS tag_array
   ) t;

Since an ARRAY constructor always produces a row (with empty array if the subquery is empty - subtle difference in the result!), LEFT JOIN LATERAL (...) ON true is not needed here. See:

Aside

You had a typo in your query. 3rd column would be t.title. I added aliases to your original (un-aggregated) query to clarify:

SELECT i.id, i.title AS item_title, t.title AS tag_title
FROM   items      i
JOIN   items_tags it ON it.item_id = i.id
JOIN   tags       t  ON t.id = it.tag_id;

"id" or "title" are typically not very distinctive and not very useful identifiers. See: