SQL Challenge – Sensor Threshold Exception Report

group byt-sqlwindow functions

I've added a solution without using window functions and a benchmark with a large data set below Martin's Answer

This is a followup thread to GROUP BY using columns not in the SELECT list – when is this practical, elegant or powerful?

In my solution to this challenge, I use a query that groups by an expression that is not part of the select list. This is frequently used with window functions, when the logical grouping element involves data from other rows.

Perhaps this is an overkill as an example, but I thought you may find the challenge interesting in its own right. I'll wait with posting my solution, maybe some of you can come up with better ones.

Challenge

We have a table of sensors that periodically logs reading values. There is no guarantee on sample times being in monotonous intervals.

You need to write a query that will report on the 'exceptions', meaning the times that the sensors reported out-of-threshold reading, either low or high. Each period of time the sensor was reporting over or under the threshold values, is considered an 'exception'. Once the reading got back to normal, the exception ends.

Sample tables and data

The script is in T-SQL, and is part of my training materials.

Here is a link to the SQLFiddle.

------------------------------------------
-- Sensor Thresholds - 1 - Setup Example --
------------------------------------------

CREATE TABLE [Sensors]
(
    [Sensor] NVARCHAR(10) NOT NULL,
    [Lower Threshold] DECIMAL(7,2) NOT NULL,
    [Upper Threshold] DECIMAL(7,2) NOT NULL,
    CONSTRAINT [PK Sensors] 
        PRIMARY KEY CLUSTERED ([Sensor]),
    CONSTRAINT [CK Value Range]
        CHECK ([Upper Threshold] > [Lower Threshold])
);
GO

INSERT INTO [Sensors]
( 
    [Sensor] ,
    [Lower Threshold] ,
    [Upper Threshold]
)
VALUES  (N'Sensor A', -50, 50 ),
        (N'Sensor B', 40, 80),
        (N'Sensor C', 0, 100);
GO

CREATE TABLE [Measurements]
(
    [Sensor] NVARCHAR(10) NOT NULL,
    [Measure Time] DATETIME2(0) NOT NULL,
    [Measurement] DECIMAL(7,2) NOT NULL,
    CONSTRAINT [PK Measurements] 
        PRIMARY KEY CLUSTERED ([Sensor], [Measure Time]),
    CONSTRAINT [FK Measurements Sensors] 
        FOREIGN KEY ([Sensor]) 
        REFERENCES [Sensors]([Sensor])
);
GO

INSERT INTO [Measurements]
( 
    [Sensor] ,
    [Measure Time] ,
    [Measurement]
)
VALUES  ( N'Sensor A', N'20160101 08:00', -9), 
        ( N'Sensor A', N'20160101 09:00', 30), 
        ( N'Sensor A', N'20160101 10:30', 59), 
        ( N'Sensor A', N'20160101 23:00', 66),  
        ( N'Sensor A', N'20160102 08:00', 48), 
        ( N'Sensor A', N'20160102 11:30', 08), 
        ( N'Sensor B', N'20160101 08:00', 39), -- Note that this exception range has both over and under....
        ( N'Sensor B', N'20160101 10:30', 88), 
        ( N'Sensor B', N'20160101 13:00', 75), 
        ( N'Sensor B', N'20160102 08:00', 95),  
        ( N'Sensor B', N'20160102 17:00', 75), 
        ( N'Sensor C', N'20160101 09:00', 01), 
        ( N'Sensor C', N'20160101 10:00', -1),  
        ( N'Sensor C', N'20160101 18:00', -2), 
        ( N'Sensor C', N'20160101 22:00', -2), 
        ( N'Sensor C', N'20160101 23:30', -1);
GO

Expected Result

Sensor      Exception Start Time    Exception End Time  Exception Duration (minutes)    Min Measurement Max Measurement Lower Threshold Upper Threshold Maximal Delta From Thresholds
------      --------------------    ------------------  ----------------------------    --------------- --------------- --------------- --------------- -----------------------------
Sensor A    2016-01-01 10:30:00     2016-01-02 08:00:00 1290                            59.00           66.00           -50.00          50.00           16.00
Sensor B    2016-01-01 08:00:00     2016-01-01 13:00:00 300                             39.00           88.00           40.00           80.00           8.00
Sensor B    2016-01-02 08:00:00     2016-01-02 17:00:00 540                             95.00           95.00           40.00           80.00           15.00
Sensor C    2016-01-01 10:00:00     2016-01-01 23:30:00 810                             -2.00           -1.00           0.00            100.00          -2.00
*/

Best Answer

I'd probably use something like the below.

It is able to use index order and avoid a sort until it gets to the final GROUP BY (which it uses a stream aggregate for, for me)

In principle this final grouping operation isn't actually needed. It should be possible to read an input stream ordered by Sensor, MeasureTime and output the desired results in a streaming fashion but I think you would need to write a SQLCLR procedure for that.

WITH T1
     AS (SELECT m.*,
                s.[Lower Threshold],
                s.[Upper Threshold],
                within_threshold,
                start_group_flag = IIF(within_threshold = 0 AND LAG(within_threshold, 1, 1) OVER (PARTITION BY m.[Sensor] ORDER BY [Measure Time]) = 1, 1, 0),
                next_measure_time = LEAD([Measure Time]) OVER (PARTITION BY m.[Sensor] ORDER BY [Measure Time]),
                overage = IIF(Measurement > [Upper Threshold], Measurement - [Upper Threshold], 0),
                underage =IIF(Measurement < [Lower Threshold], Measurement - [Lower Threshold], 0)
         FROM   [Measurements] m
                JOIN [Sensors] s
                  ON m.Sensor = s.Sensor
                CROSS APPLY (SELECT IIF(m.[Measurement] BETWEEN s.[Lower Threshold] AND s.[Upper Threshold],1,0)) ca(within_threshold)),
     T2
     AS (SELECT *,
                group_number = SUM(start_group_flag) OVER (PARTITION BY [Sensor] ORDER BY [Measure Time] ROWS UNBOUNDED PRECEDING)
         FROM   T1
         WHERE  within_threshold = 0)
SELECT Sensor,
       [Exception Start Time] = MIN([Measure Time]),
       [Exception End Time] = MAX(ISNULL(next_measure_time, [Measure Time])),
       [Exception Duration (minutes)] = DATEDIFF(MINUTE, MIN([Measure Time]), MAX(ISNULL(next_measure_time, [Measure Time]))),
       [Min Measurement] = MIN(Measurement),
       [Max Measurement] = MAX(Measurement),
       [Lower Threshold],
       [Upper Threshold],
       [Maximal Delta From Thresholds] = IIF(MAX(overage) > -MIN(underage), MAX(overage), MIN(underage))
FROM   T2
GROUP  BY group_number,
          Sensor,
          [Lower Threshold],
          [Upper Threshold] 

enter image description here