This is quite similar to this question about continuous ranges, and this question about grouping by sequential numbers, but differs in that the sequences are not numeric. Given the following relationships as key pairs
a -- b -- c e -- f -- g
| /
| /
d
This is the table with example data (also on SQLFiddle):
CREATE TABLE relationships (
name varchar(1),
related varchar(1)
);
INSERT INTO relationships (name, related) VALUES
('a', 'a'),
('a', 'b'),
('b', 'b'),
('b', 'a'),
('b', 'c'),
('b', 'd'),
('c', 'c'),
('c', 'b'),
('c', 'd'),
('d', 'd'),
('d', 'c'),
('d', 'b'),
('e', 'e'),
('e', 'f'),
('f', 'f'),
('f', 'e'),
('f', 'g'),
('g', 'g');
What is the most efficient way to produce an output that looks like this:
| group | members |
------------------------|
| 1 | {a, b, c, d} |
| 2 | {e, f, g} |
or this:
| name | group |
-----------------
| a | 1 |
| b | 1 |
| c | 1 |
| d | 1 |
| e | 2 |
| f | 2 |
| g | 2 |
I have thought about doing this operation outside of Postgres, but it seems like there must be a way to achieve this result with either a window function or PL/pgSQL.
Best Answer
Ugly, but it works!
First, define
array_agg_mult
from this questionand then run the query
Which produces the following:
SQLFiddle here - http://sqlfiddle.com/#!15/c8a5b/20. I'm a novice with this kind of query, so please let me know if there is a more efficient way to do this!