I've added a solution without using window functions and a benchmark with a large data set below Martin's Answer
This is a followup thread to GROUP BY using columns not in the SELECT list – when is this practical, elegant or powerful?
In my solution to this challenge, I use a query that groups by an expression that is not part of the select list. This is frequently used with window functions, when the logical grouping element involves data from other rows.
Perhaps this is an overkill as an example, but I thought you may find the challenge interesting in its own right. I'll wait with posting my solution, maybe some of you can come up with better ones.
Challenge
We have a table of sensors that periodically logs reading values. There is no guarantee on sample times being in monotonous intervals.
You need to write a query that will report on the 'exceptions', meaning the times that the sensors reported out-of-threshold reading, either low or high. Each period of time the sensor was reporting over or under the threshold values, is considered an 'exception'. Once the reading got back to normal, the exception ends.
Sample tables and data
The script is in T-SQL, and is part of my training materials.
Here is a link to the SQLFiddle.
------------------------------------------
-- Sensor Thresholds - 1 - Setup Example --
------------------------------------------
CREATE TABLE [Sensors]
(
[Sensor] NVARCHAR(10) NOT NULL,
[Lower Threshold] DECIMAL(7,2) NOT NULL,
[Upper Threshold] DECIMAL(7,2) NOT NULL,
CONSTRAINT [PK Sensors]
PRIMARY KEY CLUSTERED ([Sensor]),
CONSTRAINT [CK Value Range]
CHECK ([Upper Threshold] > [Lower Threshold])
);
GO
INSERT INTO [Sensors]
(
[Sensor] ,
[Lower Threshold] ,
[Upper Threshold]
)
VALUES (N'Sensor A', -50, 50 ),
(N'Sensor B', 40, 80),
(N'Sensor C', 0, 100);
GO
CREATE TABLE [Measurements]
(
[Sensor] NVARCHAR(10) NOT NULL,
[Measure Time] DATETIME2(0) NOT NULL,
[Measurement] DECIMAL(7,2) NOT NULL,
CONSTRAINT [PK Measurements]
PRIMARY KEY CLUSTERED ([Sensor], [Measure Time]),
CONSTRAINT [FK Measurements Sensors]
FOREIGN KEY ([Sensor])
REFERENCES [Sensors]([Sensor])
);
GO
INSERT INTO [Measurements]
(
[Sensor] ,
[Measure Time] ,
[Measurement]
)
VALUES ( N'Sensor A', N'20160101 08:00', -9),
( N'Sensor A', N'20160101 09:00', 30),
( N'Sensor A', N'20160101 10:30', 59),
( N'Sensor A', N'20160101 23:00', 66),
( N'Sensor A', N'20160102 08:00', 48),
( N'Sensor A', N'20160102 11:30', 08),
( N'Sensor B', N'20160101 08:00', 39), -- Note that this exception range has both over and under....
( N'Sensor B', N'20160101 10:30', 88),
( N'Sensor B', N'20160101 13:00', 75),
( N'Sensor B', N'20160102 08:00', 95),
( N'Sensor B', N'20160102 17:00', 75),
( N'Sensor C', N'20160101 09:00', 01),
( N'Sensor C', N'20160101 10:00', -1),
( N'Sensor C', N'20160101 18:00', -2),
( N'Sensor C', N'20160101 22:00', -2),
( N'Sensor C', N'20160101 23:30', -1);
GO
Expected Result
Sensor Exception Start Time Exception End Time Exception Duration (minutes) Min Measurement Max Measurement Lower Threshold Upper Threshold Maximal Delta From Thresholds
------ -------------------- ------------------ ---------------------------- --------------- --------------- --------------- --------------- -----------------------------
Sensor A 2016-01-01 10:30:00 2016-01-02 08:00:00 1290 59.00 66.00 -50.00 50.00 16.00
Sensor B 2016-01-01 08:00:00 2016-01-01 13:00:00 300 39.00 88.00 40.00 80.00 8.00
Sensor B 2016-01-02 08:00:00 2016-01-02 17:00:00 540 95.00 95.00 40.00 80.00 15.00
Sensor C 2016-01-01 10:00:00 2016-01-01 23:30:00 810 -2.00 -1.00 0.00 100.00 -2.00
*/
Best Answer
I'd probably use something like the below.
It is able to use index order and avoid a sort until it gets to the final
GROUP BY
(which it uses a stream aggregate for, for me)In principle this final grouping operation isn't actually needed. It should be possible to read an input stream ordered by
Sensor, MeasureTime
and output the desired results in a streaming fashion but I think you would need to write a SQLCLR procedure for that.