We are migrating to a new record management system and I'm weeding through the issues with the data migration. Source data and destination data are all in Postgresql
In the source database we have multiple rows for the same vendor for each authorization code for that vendor and the start and end date of that authorization.
+----------+---------------+------------+-----------+
| Vendor | Authorization | Start Date | End Date |
+----------+---------------+------------+-----------+
| Vendor 1 | ##### | 1/1/2017 | 2/28/2017 |
| Vendor 1 | ##### | 3/1/2017 | 4/15/2017 |
| Vendor 2 | ##### | 4/16/2017 | 5/31/2017 |
| Vendor 2 | ##### | 6/1/2017 | 7/12/2017 |
| Vendor 1 | ##### | 7/13/2017 | 9/30/2017 |
+----------+---------------+------------+-----------+
In the destination database we have records for the active vendor relationship at a given time with a start date and an end date. (Auth # start and end dates are tracked separately) After migration the example listed previously should look like this.
+----------+------------+-----------+
| Vendor | Start Date | End Date |
+----------+------------+-----------+
| Vendor 1 | 1/1/2017 | 4/15/2017 |
| Vendor 2 | 4/16/2017 | 7/12/2017 |
| Vendor 1 | 7/13/2017 | 9/30/2017 |
+----------+------------+-----------+
I think I should be able to accomplish this joining the table back on it self n-1 number of times where n is the largest number of consecutive authorization numbers for the same vendor which is probably around 10, which is doable but definitely not preferable.
Is there a more common method to group data by consecutive start and end dates?
Best Answer
You could use a GROUPING AND WINDOW solution in this way:
db<>fiddle here