I've got about a billion rows of data in a table with a name and an integer in the range 1-288. For a given name, every int is unique, and not every possible integer in the range is present–so there are gaps.
This query generates an example case:
--what I have:
SELECT *
FROM ( VALUES ('foo', 2),
('foo', 3),
('foo', 4),
('foo', 10),
('foo', 11),
('foo', 13),
('bar', 1),
('bar', 2),
('bar', 3)
) AS baz ("name", "int")
I'd like to generate a lookup table with a row for each name and sequence of contiguous integers. Each such row would contain:
name — the value of the name column
start — the first integer in the contiguous sequence
end — the final value in the contiguous sequence
span — end – start + 1
This query generates example output for the above example:
--what I need:
SELECT *
FROM ( VALUES ('foo', 2, 4, 3),
('foo', 10, 11, 2),
('foo', 13, 13, 1),
('bar', 1, 3, 3)
) AS contiguous_ranges ("name", "start", "end", span)
Because I have so many rows, more efficient is better. That said, I only have to run this query once, so it isn't an absolute requirement.
Thanks in advance!
Edit:
I should add that PL/pgSQL solutions are welcome (please explain any Fancy Tricks–I'm still new to PL/pgSQL).
Best Answer
How about using
with recursive
test view:
query:
result:
I'd be interested to know how that performs on your billion row table.