Sql-server – Find a date range from rows with consecutive dates and identical column values

datepartitioningsql server

A similar question has been asked before but I've been having trouble adapting those answers to my question.

I want to find rows on consecutive dates, where another column has the same value, and return that as a single date range.

I have tried using OVER and PARTITION BY along with ROW_NUMBER() to get the results I need, but I'm not familiar enough with those concepts to figure out the correct output.

The system is Microsoft SQL Server 2014.

Given:

CREATE TABLE EXCEPTIONS
(
  ID    NUMERIC(18) NOT NULL,
  DATE  DATE        NOT NULL,
  TYPE  VARCHAR(20) NOT NULL,
  VALUE VARCHAR(20),
);

and the data:

INSERT INTO EXCEPTIONS VALUES 
(17482, '2016-08-24', 'PRESSURE_EXCEPTION', 'Over'),
(17482, '2016-08-25', 'PRESSURE_EXCEPTION', 'Over'), 
(17482, '2016-08-26', 'PRESSURE_EXCEPTION', 'Over'), 
(17482, '2016-09-04', 'PRESSURE_EXCEPTION', 'Over'), 
(17482, '2016-09-05', 'PRESSURE_EXCEPTION', 'Over'), 
(17482, '2016-09-16', 'PRESSURE_EXCEPTION', 'Under'), 
(17482, '2016-09-17', 'PRESSURE_EXCEPTION', 'Under'), 
(17482, '2016-09-23', 'PRESSURE_EXCEPTION', 'Under'), 
(17482, '2016-09-24', 'PRESSURE_EXCEPTION', 'Under'), 
(17482, '2016-09-25', 'PRESSURE_EXCEPTION', 'Under'), 
(17482, '2016-09-27', 'VOLUME_EXCEPTION', 'Low'), 
(17482, '2016-09-28', 'VOLUME_EXCEPTION', 'Low')

The result of this query should be something like

StartDate  | EndDate    | Type                 | Value
2016-08-24 | 2016-08-26 | 'PRESSURE_EXCEPTION' | 'Over'
2016-09-04 | 2016-09-05 | 'PRESSURE_EXCEPTION' | 'Over'
2016-09-16 | 2016-09-17 | 'PRESSURE_EXCEPTION' | 'Under'
2016-09-23 | 2016-09-25 | 'PRESSURE_EXCEPTION' | 'Under'
2016-09-27 | 2016-09-28 | 'VOLUME_EXCEPTION'   | 'Low'

SQLFiddle available with more data

Best Answer

This is a 'Gaps and Islands' problem. I took this example and tweaked it for your situation.

DECLARE @TestData TABLE (
    ID NUMERIC(18) NOT NULL
    ,[DATE] DATE NOT NULL
    ,[TYPE] VARCHAR(20) NOT NULL
    ,[VALUE] VARCHAR(20)
    )
INSERT INTO @TestData VALUES 
(17482, '2016-08-24', 'PRESSURE_EXCEPTION', 'Over'),
(17482, '2016-08-25', 'PRESSURE_EXCEPTION', 'Over'), 
(17482, '2016-08-26', 'PRESSURE_EXCEPTION', 'Over'), 
(17482, '2016-09-04', 'PRESSURE_EXCEPTION', 'Over'), 
(17482, '2016-09-05', 'PRESSURE_EXCEPTION', 'Over'), 
(17482, '2016-09-16', 'PRESSURE_EXCEPTION', 'Under'), 
(17482, '2016-09-17', 'PRESSURE_EXCEPTION', 'Under'), 
(17482, '2016-09-23', 'PRESSURE_EXCEPTION', 'Under'), 
(17482, '2016-09-24', 'PRESSURE_EXCEPTION', 'Under'), 
(17482, '2016-09-25', 'PRESSURE_EXCEPTION', 'Under'), 
(17482, '2016-09-27', 'VOLUME_EXCEPTION', 'Low'), 
(17482, '2016-09-28', 'VOLUME_EXCEPTION', 'Low')
;

WITH mycte
AS (
    SELECT *
        ,DATEADD(DAY, - ROW_NUMBER() OVER (
                PARTITION BY [Value] ORDER BY [Date]
                ), [Date]) AS grp
    FROM @TestData
    )
--select * from mycte       --Uncomment to see the data from mycte
SELECT min([Date]) AS [From]
    ,max([Date]) AS [To]
    ,[Type]
    ,[value]
FROM mycte
GROUP BY [Type]
    ,[value]
    ,grp
ORDER BY [From];

| From       | To         | Type               | value |
|------------|------------|--------------------|-------|
| 2016-08-24 | 2016-08-26 | PRESSURE_EXCEPTION | Over  |
| 2016-09-04 | 2016-09-05 | PRESSURE_EXCEPTION | Over  |
| 2016-09-16 | 2016-09-17 | PRESSURE_EXCEPTION | Under |
| 2016-09-23 | 2016-09-25 | PRESSURE_EXCEPTION | Under |
| 2016-09-27 | 2016-09-28 | VOLUME_EXCEPTION   | Low   |

The OVER/PARTITION logic is basically assigning each row to a 'group' so you can pick out the MIN and MAX from each group. If you un-comment the select * from mycte (I added an order by date) and run the script from that part to the top, you'll see that each row gets assigned to a group.

Look at the first group of rows assigned to '2016-08-23'. The ROW_NUMBER for date '2016-08-24' is 1, so the DATEADD subtracts 1 from '2016-08-24' to put that row in group '2016-08-23'. The ROW_NUMBER for date '2016-08-25' is 2, so the DATEADD subtracts 2 from '2016-08-25' to put it in the same group as row number 1 and so on.

| ID    | DATE       | TYPE               | VALUE | grp        |
|-------|------------|--------------------|-------|------------|
| 17482 | 2016-08-24 | PRESSURE_EXCEPTION | Over  | 2016-08-23 |
| 17482 | 2016-08-25 | PRESSURE_EXCEPTION | Over  | 2016-08-23 |
| 17482 | 2016-08-26 | PRESSURE_EXCEPTION | Over  | 2016-08-23 |
| 17482 | 2016-09-04 | PRESSURE_EXCEPTION | Over  | 2016-08-31 |
| 17482 | 2016-09-05 | PRESSURE_EXCEPTION | Over  | 2016-08-31 |
| 17482 | 2016-09-16 | PRESSURE_EXCEPTION | Under | 2016-09-15 |
| 17482 | 2016-09-17 | PRESSURE_EXCEPTION | Under | 2016-09-15 |
| 17482 | 2016-09-23 | PRESSURE_EXCEPTION | Under | 2016-09-20 |
| 17482 | 2016-09-24 | PRESSURE_EXCEPTION | Under | 2016-09-20 |
| 17482 | 2016-09-25 | PRESSURE_EXCEPTION | Under | 2016-09-20 |
| 17482 | 2016-09-27 | VOLUME_EXCEPTION   | Low   | 2016-09-26 |
| 17482 | 2016-09-28 | VOLUME_EXCEPTION   | Low   | 2016-09-26 |

Now, it's just a matter of pulling the MIN and MAX from each group.