Sorry my DBA skills are a bit limited, but here's an example schema (simplified from my real table)
create table test
(
id serial primary key,
person TEXT,
start_point TEXT,
end_point TEXT
);
and here's some sample data
insert into test(person, start_point, end_point)
values ('Bob', 'a', 'b'),
('Bob', 'b', 'c'),
('Bob', 'c', 'd'),
('Alice', 'a', 'b'),
('Alice', 'b', 'c'),
('Fred', 'a', 'b'),
('Fred', 'c', 'd')
;
Notice how a particular person can go from a->b, then b->c etc.
The Question
How can I SELECT
that traverses the start/end and gives me something like this as the output?
| person | place | group_id |
|--------|-------|----------|
| Bob | a | 1 |
| Bob | b | 1 |
| Bob | c | 1 |
| Bob | d | 1 |
| Alice | a | 2 |
| Alice | b | 2 |
| Alice | c | 2 |
| Fred | a | 3 |
| Fred | b | 3 |
| Fred | c | 4 |
| Fred | d | 4 |
Notice how Bob went thru a-b-c-d, Alice went thru a-b-c, but Fred didn't make it all the way thru so he gets split out into an a-b then a c-d, because he was missing the b-c. group_id
can just be from a sequence.
What I have tried so far
What I'm currently doing, is selecting the entire dataset into a CURSOR
, then doing a LOOP
over it, checking if the start_point
matches the end_point
of the previous row, and the person
matches, whilst this condition is true group them up and update a table.
When we don't find a match, assume it's for a new person and start counting again.
This is incredibly inefficient when the table grows, and I'm in a multithreaded application so I'd much rather be able to do an atomic UPDATE...SELECT
rather than opening a cursor and looping through it.
Thanks
Best Answer
It isn't exactly clear what the group_id represents but for the traversing of the points this can be handled with a recursive query like so:
Once you share more about group_id perhaps the exact sql can be produced. Use this sql fiddle to decide what group_id should represent: http://sqlfiddle.com/#!17/cc042/2