Is it possible to create some sort of chain of grouping in Postgres? Let's say I have the following chart:
CREATE TABLE foo AS
SELECT row_number() OVER () AS id, *
FROM ( VALUES
( 'X', 'D', 'G', 'P' ),
( 'F', 'D', 'L', 'M' ),
( 'X', 'N', 'R', 'S' ),
( 'Y', 'I', 'W', NULL ),
( 'U', 'Z', 'E', NULL )
) AS f(a,b,c,d);
id | a | b | c | d
------------------
1 | X | D | G | P
2 | F | D | L | M
3 | X | N | R | S
4 | Y | I | W |
5 | U | Z | E |
I want to somehow craft a GROUP BY
that yields three groups:
1
,2
and3
together1
and2
because of a commonD
in theb
column1
and3
because of a commonX
in thea
column
4
alone (no common values in any of the columns; nulls shouldn't match)5
alone (no common values in any of the columns; nulls shouldn't match)
I'm currently using Postgres 9.5, but we'll upgrade eventually to 9.6, so if there's anything in there that'll help me, I'm open to hearing it.
In other words, I'm looking for something like (let's say I used array_agg(DISTINCT a)
, etc. to keep display simpler):
ids | as | bs | cs | ds
-----------------------------------------------------------------------
{1, 2, 3} | {'X', 'F'} | {'D', 'N'} | {'G', 'L', 'R'} | {'P', 'M', 'S'}
{4} | {'Y'} | {'I'} | {'W'} | {NULL}
{5} | {'U'} | {'Z'} | {'E'} | {NULL}
(I'm not exactly sure how the nulls would display so don't get too hung up on that; the important point is that they shouldn't match each other.)
When I use GROUP BY CUBE (a, b, c, d)
, I get way more than three results…ditto GROUP BY ROLLUP
and GROUP BY GROUPING SETS
.
Is there an elegant way in Postgres? I can imagine how you would do it in Ruby via Active Record (loop through every record, group it with previous grouped sets that match), but I'd like to keep this in Postgres if possible.
Best Answer
Another recursive solution that:
Initial data (copied from Jack Douglas' solution):
The query:
Cleanup: