SQL Challenge – Sensor Threshold Exception Report

group byt-sqlwindow functions

I've added a solution without using window functions and a benchmark with a large data set below Martin's Answer

This is a followup thread to GROUP BY using columns not in the SELECT list – when is this practical, elegant or powerful?

In my solution to this challenge, I use a query that groups by an expression that is not part of the select list. This is frequently used with window functions, when the logical grouping element involves data from other rows.

Perhaps this is an overkill as an example, but I thought you may find the challenge interesting in its own right. I'll wait with posting my solution, maybe some of you can come up with better ones.

Challenge

We have a table of sensors that periodically logs reading values. There is no guarantee on sample times being in monotonous intervals.

You need to write a query that will report on the 'exceptions', meaning the times that the sensors reported out-of-threshold reading, either low or high. Each period of time the sensor was reporting over or under the threshold values, is considered an 'exception'. Once the reading got back to normal, the exception ends.

Sample tables and data

The script is in T-SQL, and is part of my training materials.

Here is a link to the SQLFiddle.

------------------------------------------
-- Sensor Thresholds - 1 - Setup Example --
------------------------------------------

CREATE TABLE [Sensors]
(
    [Sensor] NVARCHAR(10) NOT NULL,
    [Lower Threshold] DECIMAL(7,2) NOT NULL,
    [Upper Threshold] DECIMAL(7,2) NOT NULL,
    CONSTRAINT [PK Sensors] 
        PRIMARY KEY CLUSTERED ([Sensor]),
    CONSTRAINT [CK Value Range]
        CHECK ([Upper Threshold] > [Lower Threshold])
);
GO

INSERT INTO [Sensors]
( 
    [Sensor] ,
    [Lower Threshold] ,
    [Upper Threshold]
)
VALUES  (N'Sensor A', -50, 50 ),
        (N'Sensor B', 40, 80),
        (N'Sensor C', 0, 100);
GO

CREATE TABLE [Measurements]
(
    [Sensor] NVARCHAR(10) NOT NULL,
    [Measure Time] DATETIME2(0) NOT NULL,
    [Measurement] DECIMAL(7,2) NOT NULL,
    CONSTRAINT [PK Measurements] 
        PRIMARY KEY CLUSTERED ([Sensor], [Measure Time]),
    CONSTRAINT [FK Measurements Sensors] 
        FOREIGN KEY ([Sensor]) 
        REFERENCES [Sensors]([Sensor])
);
GO

INSERT INTO [Measurements]
( 
    [Sensor] ,
    [Measure Time] ,
    [Measurement]
)
VALUES  ( N'Sensor A', N'20160101 08:00', -9), 
        ( N'Sensor A', N'20160101 09:00', 30), 
        ( N'Sensor A', N'20160101 10:30', 59), 
        ( N'Sensor A', N'20160101 23:00', 66),  
        ( N'Sensor A', N'20160102 08:00', 48), 
        ( N'Sensor A', N'20160102 11:30', 08), 
        ( N'Sensor B', N'20160101 08:00', 39), -- Note that this exception range has both over and under....
        ( N'Sensor B', N'20160101 10:30', 88), 
        ( N'Sensor B', N'20160101 13:00', 75), 
        ( N'Sensor B', N'20160102 08:00', 95),  
        ( N'Sensor B', N'20160102 17:00', 75), 
        ( N'Sensor C', N'20160101 09:00', 01), 
        ( N'Sensor C', N'20160101 10:00', -1),  
        ( N'Sensor C', N'20160101 18:00', -2), 
        ( N'Sensor C', N'20160101 22:00', -2), 
        ( N'Sensor C', N'20160101 23:30', -1);
GO

Expected Result

Sensor      Exception Start Time    Exception End Time  Exception Duration (minutes)    Min Measurement Max Measurement Lower Threshold Upper Threshold Maximal Delta From Thresholds
------      --------------------    ------------------  ----------------------------    --------------- --------------- --------------- --------------- -----------------------------
Sensor A    2016-01-01 10:30:00     2016-01-02 08:00:00 1290                            59.00           66.00           -50.00          50.00           16.00
Sensor B    2016-01-01 08:00:00     2016-01-01 13:00:00 300                             39.00           88.00           40.00           80.00           8.00
Sensor B    2016-01-02 08:00:00     2016-01-02 17:00:00 540                             95.00           95.00           40.00           80.00           15.00
Sensor C    2016-01-01 10:00:00     2016-01-01 23:30:00 810                             -2.00           -1.00           0.00            100.00          -2.00
*/

Best Answer

I'd probably use something like the below.

It is able to use index order and avoid a sort until it gets to the final GROUP BY (which it uses a stream aggregate for, for me)

In principle this final grouping operation isn't actually needed. It should be possible to read an input stream ordered by Sensor, MeasureTime and output the desired results in a streaming fashion but I think you would need to write a SQLCLR procedure for that.

WITH T1
     AS (SELECT m.*,
                s.[Lower Threshold],
                s.[Upper Threshold],
                within_threshold,
                start_group_flag = IIF(within_threshold = 0 AND LAG(within_threshold, 1, 1) OVER (PARTITION BY m.[Sensor] ORDER BY [Measure Time]) = 1, 1, 0),
                next_measure_time = LEAD([Measure Time]) OVER (PARTITION BY m.[Sensor] ORDER BY [Measure Time]),
                overage = IIF(Measurement > [Upper Threshold], Measurement - [Upper Threshold], 0),
                underage =IIF(Measurement < [Lower Threshold], Measurement - [Lower Threshold], 0)
         FROM   [Measurements] m
                JOIN [Sensors] s
                  ON m.Sensor = s.Sensor
                CROSS APPLY (SELECT IIF(m.[Measurement] BETWEEN s.[Lower Threshold] AND s.[Upper Threshold],1,0)) ca(within_threshold)),
     T2
     AS (SELECT *,
                group_number = SUM(start_group_flag) OVER (PARTITION BY [Sensor] ORDER BY [Measure Time] ROWS UNBOUNDED PRECEDING)
         FROM   T1
         WHERE  within_threshold = 0)
SELECT Sensor,
       [Exception Start Time] = MIN([Measure Time]),
       [Exception End Time] = MAX(ISNULL(next_measure_time, [Measure Time])),
       [Exception Duration (minutes)] = DATEDIFF(MINUTE, MIN([Measure Time]), MAX(ISNULL(next_measure_time, [Measure Time]))),
       [Min Measurement] = MIN(Measurement),
       [Max Measurement] = MAX(Measurement),
       [Lower Threshold],
       [Upper Threshold],
       [Maximal Delta From Thresholds] = IIF(MAX(overage) > -MIN(underage), MAX(overage), MIN(underage))
FROM   T2
GROUP  BY group_number,
          Sensor,
          [Lower Threshold],
          [Upper Threshold]

Related Solutions

SQL Server 2008 Error – Parallel Data Warehouse Features Not Enabled

The Parallel Data Warehouse (PDW) features are not enabled.

This is a parser bug that exists only in SQL Server 2008. Non-PDW versions of SQL Server before 2012 do not support the ORDER BY clause with aggregate functions like MIN:

Books Online extract

Windowing function support was considerably extended in 2012, compared with the basic implementation available starting with SQL Server 2005. The extensions were made available in Parallel Data Warehouse before being incorporated in the box product. Because the various editions share a common code-base, misleading error messages like this are possible.

If you are interested, the call stack when the aggregate is verified by the parser is shown below. Because the aggregate has an OVER clause with ORDER BY, a check for PDW is issued:

Aggregate verification

This check immediately fails with a parser error:

Parser error

Luckily, you do not need an windowed aggregate that supports ORDER BY framing to solve your code problem.

T-sql – build users availability report/Table (using sql)

I think in think the use of a tally table would work in this case. There are 1440 minutes in a day, so a tally table would need a minimum of 1440 records in order to work. Using the tally table, you can utilize the DATEADD function to add one minute between your start times and your end times. This would pull back one record per minute for the interval.

/* create the tables variables */
DECLARE @Tally TABLE
    (MinuteOfDay INT)

DECLARE @Users TABLE
    (UserName  VARCHAR(100)
    ,StartDatetime DATETIME 
    ,EndDateTime DATETIME
    )


/* Populate the user table */
INSERT INTO @Users
        ( UserName ,
          StartDatetime ,
          EndDateTime
        )
VALUES  ( 'User_A' , -- UserName - varchar(100)
          '11-10-2015 12:00' , -- StartDatetime - datetime
          '11-10-2015 12:03'  -- EndDateTime - datetime
        )

/* Populate the tally table with 1 minute for 24 hours */
INSERT INTO @Tally
        ( MinuteOfDay )
SELECT TOP 1440
    ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) -1 AS Row_No
FROM sys.all_objects AS ao


/* Query to pull the result for each minute */
SELECT * FROM @Users AS u
INNER JOIN @Tally AS t ON DATEADD(MINUTE, t.MinuteOfDay,u.StartDatetime) BETWEEN u.StartDatetime AND u.EndDateTime