This is a problem I've been trying to work through on and off for the past month, and haven't made any progress. Essentially, I am working with a HR system that stores two major types of position for employees: Primary (P) and Acting (A). The position types aren't hugely relevant, but are important to know to understand the way the data is structured. An employee will always be in a Primary position, but may be put into an Acting position temporarily while it is vacant, or while the Primary person for that role is on leave.
The way that this is represented in the system is diagrammed below (as an example and not to scale):
P (2018-07-28 - 9999-99-99) <------------------------------------------------------------->
A (2018-09-04 - 2018-09-05) <--->
A (2018-11-12 - 2018-11-13) <--->
A (2018-11-19 - 2018-11-20) <--->
Due to a set of data I'm trying to join the position information with, I need it to be intersected to look like this, where the Acting position always takes precedence over the Primary position.
P 2018-07-28 - 2018-09-03 <---------->
A 2018-09-04 - 2018-09-05 <--->
P 2018-09-06 - 2018-11-11 <----------------------->
A 2018-11-12 - 2018-11-13 <--->
P 2018-11-14 - 2018-11-18 <----->
A 2018-11-19 - 2018-11-20
<--->
P 2018-11-21 - 9999-99-99
<----->
Below is a representation of the data directly from the database, to give an idea of what it looks like overall:
Employee Position Start Date End Date Position Type
1 30 2016-12-10 2016-12-11 P
1 30 2016-12-12 2017-12-08 P
1 20 2017-01-31 2017-02-02 A
1 20 2017-03-29 2017-03-31 A
1 20 2017-04-19 2017-04-21 A
1 20 2017-06-22 2017-06-23 A
1 20 2017-06-29 2017-06-30 A
1 20 2017-10-03 2017-10-06 A
1 30 2017-12-09 2018-07-26 P
1 20 2018-02-12 2018-03-02 A
1 20 2018-07-19 2018-07-20 A
1 30 2018-07-27 2018-07-27 P
1 30 2018-07-28 9999-99-99 P
1 20 2018-09-04 2018-09-05 A
1 20 2018-11-12 2018-11-13 A
1 20 2018-11-19 2018-11-20 A
As you can see, there are even multiple entries for the same position where the dates are contiguous. These need to be combined into a contiguous entry until they are intersected/overridden by Acting Positions. I've been scouring different sources on the internet for the past month and haven't even come close to a viable solution. The closest I've come is from SQL – Combining Overlapping Date Rows, which I've converted from Oracle but I don't have enough knowledge of T-SQL to take it all the way to what I need, original Oracle code below:
SELECT
KEY,
MIN(TD) MINTD,
MAX(TD) MAXTD
FROM
(SELECT
KEY,
TD,
SUM(C) OVER (PARTITION BY KEY ORDER BY TD) SC
FROM
(SELECT
KEY,
D,
TD,
TPD,
DECODE(SIGN(LTD-TPD),1,1,0) C
FROM
(SELECT
KEY,
D,
TRUNC(D,'MM') TD,
ADD_MONTHS(TRUNC(D,'MM'),-1) LTD,
LAG(D,1) OVER (PARTITION BY KEY ORDER BY D) PD,
LAG(TRUNC(D,'MM'),1) OVER (PARTITION BY KEY ORDER BY D) TPD
FROM
T1)))
GROUP BY
KEY,
SC
ORDER BY
KEY,
SC;
Any ideas on a SELECT query to group and intersect the dates as outlined above?
Best Answer
First, I think the link you had supplied doesn't fit for your question because it is about combining overlapped dates, and you need to generate (GAPs) new dates.
I would divide your data in two groups, those corresponding to Primary(s) and those that belong to Acting(s).
Next Acting rows are subsumed into the second Principal row:
12/12/2016 | 08/12/2017
Now let me generate a new dataset, for each Primary I'll add the corresponding Acting(s) using an OUTER JOIN, this outer join returns all Primary(s) even if they haven't subsumed Acting(s).
This is the new scenario:
I've added two partitions P1, P2 and I set up an index on them, these partitions will help me to filter information in this new scenario according to 4 types of rows:
Type 1:
Primary(s) that don't have any subsumed Acting.
Type 2:
If exists, left gaps between Acting(s) plus between first Acting and Primary start date.
Type 3:
All Acting preriods.
Type 4:
If exists, right gaps between last Acting and Primary end date.
Now the selection:
This is the final result:
db<>fiddle here