Sql-server – Days in between two dates group by month

sql serversql-server-2016t-sql

I'm trying to create a lost time report in SSRS. I found a script that gives me what I need for a single date range, but I need this for multiple dates, and I need the total for the range.

So if I have 6 date ranges, I just need the total for the month/year combination and type combination (either L or R).

DECLARE @s SMALLDATETIME, @e SMALLDATETIME;
SELECT  @s = '20161209',  @e = '20170113';
    ;WITH n(n) AS
(
  SELECT TOP (DATEDIFF(MONTH, @s, @e)+1) ROW_NUMBER() OVER 
  (ORDER BY [object_id])-1 FROM sys.all_objects
),
x(n,fd,ld) AS 
(
  SELECT n.n, DATEADD(MONTH, n.n, m.m), DATEADD(MONTH, n.n+1, m.m)
  FROM n, (SELECT DATEADD(DAY, 1-DAY(@s), @s)) AS m(m)
)
SELECT [Month] = DATENAME(MONTH, fd), [Days] = DATEDIFF(DAY, fd, ld) 
  - CASE WHEN @s > fd THEN (DATEDIFF(DAY, fd, @s)+1) ELSE 0 END
  - CASE WHEN @e < ld THEN (DATEDIFF(DAY, @e, ld)-1) ELSE 0 END
  FROM x;

My table is called HRAL and the main fields are BeginDate (inclusive), EndDate (exclusive) and Type.

Sample Data

StartDate   EndDate    Type
2017-09-28  2017-10-02 L
2017-10-03  2017-10-10 R
2016-11-10  2016-11-11 L
2017-08-17  2017-12-25 R

Results

Date        Days  Type
2017-09-01  3     L
2017-10-01  1     L
2017-10-01  38    R
2017-11-01  1     L
2017-08-01  15    R
2017-09-01  30    R
2017-11-01  30    R
2017-12-01  24    R

I assume the solution needs recursion?

Best Answer

First I'll create a very abbreviated version of a calendar table that stores just month start and end dates.

CREATE TABLE #LAZY_DATE_DIM (
    MONTH_START_DATE DATETIME,
    MONTH_END_DATE DATETIME,
    PRIMARY KEY (MONTH_START_DATE)
);

INSERT INTO #LAZY_DATE_DIM WITH (TABLOCK)
SELECT DATEADD(MONTH, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), '18991201')
, DATEADD(DAY, -1, DATEADD(MONTH, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), '19000101' ))
FROM master..spt_values;

You may find it useful to create a permanent calendar table in your database, but there are ways to generate the data that you need on the fly if that isn't an option.

Next step is to mock up your sample data:

CREATE TABLE #HRAL (
    StartDate DATETIME,
    EndDate DATETIME,
    [Type] VARCHAR(1)
);

INSERT INTO #HRAL VALUES ('2017-09-28','2017-10-02','L');
INSERT INTO #HRAL VALUES ('2017-10-03','2017-10-10','R');
INSERT INTO #HRAL VALUES ('2016-11-10','2016-11-11','L');
INSERT INTO #HRAL VALUES ('2017-08-17','2017-12-25','R');

I broke down the overall query into a few different steps. First step is to join the tables together such that we get a row for every relevant month that intersects with the start and end dates in your HRAL table. I used the following code:

SELECT
  h.*
, dd.*
FROM #HRAL h
INNER JOIN #LAZY_DATE_DIM dd ON 
    dd.MONTH_START_DATE > DATEADD(MONTH, -1, h.StartDate)
    AND dd.MONTH_START_DATE < h.EndDate

The intermediate results:

╔═════════════════════════╦═════════════════════════╦══════╦═════════════════════════╦═════════════════════════╗
║        StartDate        ║         EndDate         ║ Type ║    MONTH_START_DATE     ║     MONTH_END_DATE      ║
╠═════════════════════════╬═════════════════════════╬══════╬═════════════════════════╬═════════════════════════╣
║ 2017-09-28 00:00:00.000 ║ 2017-10-02 00:00:00.000 ║ L    ║ 2017-09-01 00:00:00.000 ║ 2017-09-30 00:00:00.000 ║
║ 2017-09-28 00:00:00.000 ║ 2017-10-02 00:00:00.000 ║ L    ║ 2017-10-01 00:00:00.000 ║ 2017-10-31 00:00:00.000 ║
║ 2017-10-03 00:00:00.000 ║ 2017-10-10 00:00:00.000 ║ R    ║ 2017-10-01 00:00:00.000 ║ 2017-10-31 00:00:00.000 ║
║ 2016-11-10 00:00:00.000 ║ 2016-11-11 00:00:00.000 ║ L    ║ 2016-11-01 00:00:00.000 ║ 2016-11-30 00:00:00.000 ║
║ 2017-08-17 00:00:00.000 ║ 2017-12-25 00:00:00.000 ║ R    ║ 2017-08-01 00:00:00.000 ║ 2017-08-31 00:00:00.000 ║
║ 2017-08-17 00:00:00.000 ║ 2017-12-25 00:00:00.000 ║ R    ║ 2017-09-01 00:00:00.000 ║ 2017-09-30 00:00:00.000 ║
║ 2017-08-17 00:00:00.000 ║ 2017-12-25 00:00:00.000 ║ R    ║ 2017-10-01 00:00:00.000 ║ 2017-10-31 00:00:00.000 ║
║ 2017-08-17 00:00:00.000 ║ 2017-12-25 00:00:00.000 ║ R    ║ 2017-11-01 00:00:00.000 ║ 2017-11-30 00:00:00.000 ║
║ 2017-08-17 00:00:00.000 ║ 2017-12-25 00:00:00.000 ║ R    ║ 2017-12-01 00:00:00.000 ║ 2017-12-31 00:00:00.000 ║
╚═════════════════════════╩═════════════════════════╩══════╩═════════════════════════╩═════════════════════════╝

We can find the overlapping days by taking the maximum of the two start dates, the minimum of the two end days (adding one to the month end date from the calendar table), and using DATEDIFF to find the difference of days. Below is the query that shows some of the intermediate column values:

SELECT
  dd.MONTH_START_DATE
, h.StartDate
, start_dt.dt calc_start_dt
, h.[Type]
, dd.MONTH_END_DATE
, h.EndDate
, end_dt.dt calc_end_dt
FROM #HRAL h
INNER JOIN #LAZY_DATE_DIM dd ON 
    dd.MONTH_START_DATE > DATEADD(MONTH, -1, h.StartDate)
    AND dd.MONTH_START_DATE < h.EndDate
CROSS APPLY (
    SELECT MAX(start_dt)
    FROM (VALUES (dd.MONTH_START_DATE), (h.StartDate)) x (start_dt)
) start_dt (dt)
CROSS APPLY (
    SELECT MIN(end_dt)
    FROM (VALUES (DATEADD(DAY, 1, dd.MONTH_END_DATE)), (h.EndDate)) x (end_dt)
) end_dt (dt)

The result set:

╔═════════════════════════╦═════════════════════════╦═════════════════════════╦══════╦═════════════════════════╦═════════════════════════╦═════════════════════════╗
║    MONTH_START_DATE     ║        StartDate        ║      calc_start_dt      ║ Type ║     MONTH_END_DATE      ║         EndDate         ║       calc_end_dt       ║
╠═════════════════════════╬═════════════════════════╬═════════════════════════╬══════╬═════════════════════════╬═════════════════════════╬═════════════════════════╣
║ 2017-09-01 00:00:00.000 ║ 2017-09-28 00:00:00.000 ║ 2017-09-28 00:00:00.000 ║ L    ║ 2017-09-30 00:00:00.000 ║ 2017-10-02 00:00:00.000 ║ 2017-10-01 00:00:00.000 ║
║ 2017-10-01 00:00:00.000 ║ 2017-09-28 00:00:00.000 ║ 2017-10-01 00:00:00.000 ║ L    ║ 2017-10-31 00:00:00.000 ║ 2017-10-02 00:00:00.000 ║ 2017-10-02 00:00:00.000 ║
║ 2017-10-01 00:00:00.000 ║ 2017-10-03 00:00:00.000 ║ 2017-10-03 00:00:00.000 ║ R    ║ 2017-10-31 00:00:00.000 ║ 2017-10-10 00:00:00.000 ║ 2017-10-10 00:00:00.000 ║
║ 2016-11-01 00:00:00.000 ║ 2016-11-10 00:00:00.000 ║ 2016-11-10 00:00:00.000 ║ L    ║ 2016-11-30 00:00:00.000 ║ 2016-11-11 00:00:00.000 ║ 2016-11-11 00:00:00.000 ║
║ 2017-08-01 00:00:00.000 ║ 2017-08-17 00:00:00.000 ║ 2017-08-17 00:00:00.000 ║ R    ║ 2017-08-31 00:00:00.000 ║ 2017-12-25 00:00:00.000 ║ 2017-09-01 00:00:00.000 ║
║ 2017-09-01 00:00:00.000 ║ 2017-08-17 00:00:00.000 ║ 2017-09-01 00:00:00.000 ║ R    ║ 2017-09-30 00:00:00.000 ║ 2017-12-25 00:00:00.000 ║ 2017-10-01 00:00:00.000 ║
║ 2017-10-01 00:00:00.000 ║ 2017-08-17 00:00:00.000 ║ 2017-10-01 00:00:00.000 ║ R    ║ 2017-10-31 00:00:00.000 ║ 2017-12-25 00:00:00.000 ║ 2017-11-01 00:00:00.000 ║
║ 2017-11-01 00:00:00.000 ║ 2017-08-17 00:00:00.000 ║ 2017-11-01 00:00:00.000 ║ R    ║ 2017-11-30 00:00:00.000 ║ 2017-12-25 00:00:00.000 ║ 2017-12-01 00:00:00.000 ║
║ 2017-12-01 00:00:00.000 ║ 2017-08-17 00:00:00.000 ║ 2017-12-01 00:00:00.000 ║ R    ║ 2017-12-31 00:00:00.000 ║ 2017-12-25 00:00:00.000 ║ 2017-12-25 00:00:00.000 ║
╚═════════════════════════╩═════════════════════════╩═════════════════════════╩══════╩═════════════════════════╩═════════════════════════╩═════════════════════════╝

Now I can put it all together by using DATEDIFF along with GROUP BY. The final query:

SELECT
  dd.MONTH_START_DATE
, SUM(DATEDIFF(DAY, start_dt.dt, end_dt.dt)) [days]
, h.[Type]
FROM #HRAL h
INNER JOIN #LAZY_DATE_DIM dd ON 
    dd.MONTH_START_DATE > DATEADD(MONTH, -1, h.StartDate)
    AND dd.MONTH_START_DATE < h.EndDate
CROSS APPLY (
    SELECT MAX(start_dt)
    FROM (VALUES (dd.MONTH_START_DATE), (h.StartDate)) x (start_dt)
) start_dt (dt)
CROSS APPLY (
    SELECT MIN(end_dt)
    FROM (VALUES (DATEADD(DAY, 1, dd.MONTH_END_DATE)), (h.EndDate)) x (end_dt)
) end_dt (dt)
GROUP BY
  dd.MONTH_START_DATE
, h.[Type];

The results match yours:

╔═════════════════════════╦══════╦══════╗
║    MONTH_START_DATE     ║ days ║ Type ║
╠═════════════════════════╬══════╬══════╣
║ 2016-11-01 00:00:00.000 ║    1 ║ L    ║
║ 2017-09-01 00:00:00.000 ║    3 ║ L    ║
║ 2017-10-01 00:00:00.000 ║    1 ║ L    ║
║ 2017-08-01 00:00:00.000 ║   15 ║ R    ║
║ 2017-09-01 00:00:00.000 ║   30 ║ R    ║
║ 2017-10-01 00:00:00.000 ║   38 ║ R    ║
║ 2017-11-01 00:00:00.000 ║   30 ║ R    ║
║ 2017-12-01 00:00:00.000 ║   24 ║ R    ║
╚═════════════════════════╩══════╩══════╝

If you prefer to avoid the CROSS APPLY syntax you can accomplish the same thing with a CASE expression.