Sql-server – Create Date Spans from Contiguous and Non-Contiguous Date Spans

datesql servert-sql

I am trying to create a query that will return a series of dates based off of the 'BEG_DT' and 'END_DT'. The results should show a series dates based off of the 'BEG_DT' and with a span between each date equal to the 'PLAN_LENGTH' value. An Example would be a 'BEG_DT' of 2020-07-01, 'END_DT' of 2100-06-30 and 'PLAN_LENGTH' of 12, so it would return the dates '2020-07-01' and '2021-07-01'. The last Date it should generate is 2021-07-01 as the next date of '2022-07-01' is more than 12 month in the future.

The logic requirements would look like this:

  1. Start date should be equal to the BEG_DT
  2. Each date span should be equal to the PLAN_LENGTH months apart
  3. Do not exceed END_DT
  4. Do not return a date that exceeds the PLAN_LENGTH months from the current run date.
  5. Each GROUP_ID and PLAN_ID can have its own start and end date.

This is the logic that I have tried use. I have not been able to get it to recognize that different group and plan combinations can have different starting month and days.

DECLARE @RUN_DATE DATETIME = CURRENT_TIMESTAMP

    Create table #DATE_SPANS
    (
        GROUP_ID VARCHAR(8)
    ,   PLAN_ID VARCHAR(8)
    ,   BEG_DT DATE
    ,   END_DT DATE
    ,   PLAN_LENGTH INT
    )

    Insert Into #DATE_SPANS
    (GROUP_ID, PLAN_ID, BEG_DT, END_DT, PLAN_LENGTH)
    values 
        ('82551399','AMT00001','2020-01-01','2020-12-31','12')
    ,   ('82551399','AMT00002','2020-01-01','2100-12-31','12')
    ,   ('82551399','AMT00003','2020-01-01','2021-12-31','12')
    ,   ('75491773','AMT00004','2020-01-01','2021-06-30','18')
    ,   ('32198498','AMD00001','2020-10-01','2021-09-30','12')
    ,   ('32198498','AMD00001','2021-10-01','2022-09-30','12')
    ,   ('32198498','AMD00002','2020-10-01','2022-09-30','12')
    ,   ('32198498','AMD00003','2020-10-01','2100-09-30','12')


declare @BEGIN_DT_1 DATETIME = (SELECT MIN(DISTINCT(BEG_DT)) FROM #DATE_SPANS)
declare @END_DT_1 DATETIME = (SELECT MAX(DISTINCT(CASE WHEN END_DT = '9999-12-31' then @RUN_DATE ELSE END_DT END)) FROM #DATE_SPANS )

--SELECT @END_DT_1

;with dates ([Date]) as (
    Select convert(date, @BEGIN_DT_1) as [Date] -- Put the start date here

    union all 

    Select dateadd(YEAR, 1, [Date])
    from dates
    where [Date] <= @END_DT_1 -- Put the end date here 
) 

select t.GROUP_ID
    , t.PLAN_ID
    , [Date] AS [DATE]
    INTO #TEMP_PLAN_YEARS
    from dates d
    join #DATE_SPANS t on d.Date >= t.BEG_DT
        and d.Date <= DATEADD(MM,0,t.END_DT)
option (maxrecursion 32767)

select distinct GROUP_ID, PLAN_ID,  [DATE]
into #TEMP_PLAN_YEARS_FINAL
from #TEMP_PLAN_YEARS
where [DATE] >= '2021-01-01'
order by [DATE],  PLAN_ID

select * from #TEMP_PLAN_YEARS_FINAL
where DATE <= DATEADD(YEAR, 1, GETDATE())

————TEST DATA——–

CREATE TABLE #DATE_SPANS
(
    GROUP_ID VARCHAR(8)
,   PLAN_ID VARCHAR(8)
,   BEG_DT DATE
,   END_DT DATE
,   PLAN_LENGTH INT
)

INSERT INTO #DATE_SPANS 
(GROUP_ID, PLAN_ID, BEG_DT, END_DT, PLAN_LENGTH)

Group 82551399 has 3 different plan IDs that run for different periods of time. Each plan is based off of a 12 month plan year. AMT00001 has its one year shown. AMT00002 will be broken down into three plan years for 2020, 2021 & 2022.

Note: System Dates can be posted out to and beyond 2100. Often they show as 9999-12-31. AMT00003 will be broken down into two years 2020 & 2021. On a 12 month calendar year. Group 32198498 will follow the same as group 82551399 but will have an effective date starting 10-01.

Run Date for results below would be 6/7/2021

 VALUES
    ('82551399','AMT00001','2020-01-01','2020-12-31','12')
,   ('82551399','AMT00002','2020-01-01','2100-12-31','12')
,   ('82551399','AMT00003','2020-01-01','2021-12-31','12')
,   ('75491773','AMT00004','2020-01-01','2021-06-31','18')
,   ('32198498','AMD00001','2020-10-01','2021-09-30','12')
,   ('32198498','AMD00001','2021-10-01','2022-09-30','12')
,   ('32198498','AMD00002','2020-10-01','2022-09-30','12')
,   ('32198498','AMD00003','2020-10-01','2100-09-30','12')

/*

>     EXPECTED RESULT
> --Group ID 8251399
>     82551399, AMT00001, 2020-01-01
> 
>     82551399, AMT00002, 2020-01-01
>     82551399, AMT00002, 2021-01-01
>     82551399, AMT00002, 2022-01-01
> 
>     82551399, AMT00003, 2020-01-01
>     82551399, AMT00003, 2021-01-01
> 
> --Group ID 75491773
> 
> 75491773, AMT00004, 2020-01-01
> 
> --Group ID 32198498
>     32198498, AMD00001, 2020-10-01
>     32198498, AMD00001, 2021-10-01
> 
>     32198498, AMD00002, 2020-10-01 
>     32198498, AMD00002, 2021-10-01
  
>     32198498, AMD00003, 2020-10-01
>     32198498, AMD00003, 2021-10-01
>     */

Best Answer

My approach to this problem could be summarised like this:

  • For each date range, get the list of all dates within the interval that are PLAN_LENGTH months apart, starting at BEG_DT.
  • Exclude all dates that are equal to or later than today plus PLAN_LENGTH months.

If this were PostgreSQL, the logic could be rather succinctly expressed like this:

SELECT
  ds.*
, x.Date
FROM
  date_spans AS ds
, generate_series(ds.BEG_DT, ds.END_DT, (ds.PLAN_LENGTH || ' months')::interval) AS x (Date)
WHERE
  x.Date < CURRENT_TIMESTAMP::date + ds.PLAN_LENGTH * INTERVAL '1 month'
ORDER BY
  ds.group_id ASC
, ds.plan_id ASC
, x.Date ASC
;

Alas, Transact-SQL does not have generate_series or an equivalent, so I had to make do without it. I still used the above query as a general pointer, though. The following Transact-SQL solution, therefore, contains an attempt to emulate generate_series using other means, namely a numbers table (or rather a simple CTE implementation thereof) and a little bit of date arithmetic:

WITH
  Numbers (N) AS
  (
    SELECT
      hundreds.N * 100 + tens.N * 10 + ones.N * 1
    FROM
      (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) AS ones (N)
    , (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) AS tens (N)
    , (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) AS hundreds (N)
  )
SELECT
  ds.*
, x.Date
FROM
  dbo.date_spans AS ds

  -- this join multiplies each range as many times as there are
  -- PLAN_LENGTH-month intervals within the range
  INNER JOIN Numbers AS n
    ON n.N BETWEEN 0 AND DATEDIFF(MONTH, ds.BEG_DT, ds.END_DT) / ds.PLAN_LENGTH

  -- this generates an actual date
  CROSS APPLY
  (
    SELECT DATEADD(YEAR, n.N, ds.BEG_DT)
  ) AS x (Date)
WHERE
  -- this additionally caps the date list at a point that is today + PLAN_LENGTH months
  x.Date < DATEADD(MONTH, ds.PLAN_LENGTH, CAST(CURRENT_TIMESTAMP AS date))
ORDER BY
  ds.group_id ASC
, ds.plan_id ASC
, n.N ASC
;

The Numbers dataset on its own can be viewed as a replacement for generate_series for (non-negative) integers. And together with the CROSS APPLY it constitutes a replacement for the timestamp-related version of that function.

For the test setup provided in the question, the above script returns this output:

group_id plan_id beg_dt end_dt plan_length Date
32198498 AMD00001 2020-10-01 2021-09-30 12 2020-10-01
32198498 AMD00001 2021-10-01 2022-09-30 12 2021-10-01
32198498 AMD00002 2020-10-01 2022-09-30 12 2020-10-01
32198498 AMD00002 2020-10-01 2022-09-30 12 2021-10-01
32198498 AMD00003 2020-10-01 2100-09-30 12 2020-10-01
32198498 AMD00003 2020-10-01 2100-09-30 12 2021-10-01
75491773 AMT00004 2020-01-01 2021-06-30 18 2020-01-01
82551399 AMT00001 2020-01-01 2020-12-31 12 2020-01-01
82551399 AMT00002 2020-01-01 2100-12-31 12 2020-01-01
82551399 AMT00002 2020-01-01 2100-12-31 12 2021-01-01
82551399 AMT00002 2020-01-01 2100-12-31 12 2022-01-01
82551399 AMT00003 2020-01-01 2021-12-31 12 2020-01-01
82551399 AMT00003 2020-01-01 2021-12-31 12 2021-01-01

A live version of the solution is available at db<>fiddle.