First, a table and some sample data to play with:
USE tempdb;
GO
CREATE TABLE dbo.SomeTable(dt DATETIME);
GO
SET NOCOUNT ON;
GO
INSERT dbo.SomeTable(dt) SELECT DATEADD(MINUTE, -22, GETDATE());
GO 45
INSERT dbo.SomeTable(dt) SELECT DATEADD(MINUTE, -19, GETDATE());
GO 32
INSERT dbo.SomeTable(dt) SELECT DATEADD(MINUTE, -17, GETDATE());
GO 21
INSERT dbo.SomeTable(dt) SELECT DATEADD(MINUTE, -12, GETDATE());
GO 16
INSERT dbo.SomeTable(dt) SELECT DATEADD(MINUTE, -5, GETDATE());
GO 55
INSERT dbo.SomeTable(dt) SELECT DATEADD(MINUTE, -2, GETDATE());
GO 26
INSERT dbo.SomeTable(dt) SELECT DATEADD(MINUTE, -1, GETDATE());
GO 71
INSERT dbo.SomeTable(dt) SELECT GETDATE();
GO 14
(I'd do this in sqlfiddle but I'm not sure it supports GO <int>
to have many rows, and it chokes on INSERT
> 8000 characters.)
Now a stored procedure:
CREATE PROCEDURE dbo.GetGroupedIntervals
@MinuteInterval TINYINT = 1
AS
BEGIN
SET NOCOUNT ON;
DECLARE
@IntervalCount INT, @StartDate SMALLDATETIME;
SELECT
@StartDate = DATEADD(MINUTE, -1, MIN(dt)),
@IntervalCount = (DATEDIFF(MINUTE, MIN(dt), MAX(dt))
+ @MinuteInterval) / @MinuteInterval
FROM dbo.SomeTable -- WHERE ...;
;WITH dates(s,e) AS
(
SELECT
DATEADD(MINUTE, @MinuteInterval*(n.n-1), @StartDate),
DATEADD(MINUTE, @MinuteInterval*(n.n), @StartDate)
FROM
(
SELECT
TOP (@IntervalCount) ROW_NUMBER() OVER (ORDER BY o.[object_id])
FROM sys.all_objects AS o CROSS JOIN sys.all_columns AS c
ORDER BY o.[object_id]
) AS n(n)
)
SELECT StartDate = d.s, c = COUNT(s.dt)
FROM dates AS d
LEFT OUTER JOIN dbo.SomeTable AS s
ON s.dt >= d.s AND s.dt < d.e
-- AND any filter criteria for dbo.SomeTable?
GROUP BY d.s
ORDER BY d.s;
END
GO
And some sample usage:
EXEC dbo.GetGroupedIntervals @MinuteInterval = 1;
EXEC dbo.GetGroupedIntervals @MinuteInterval = 2;
EXEC dbo.GetGroupedIntervals @MinuteInterval = 5;
For brevity, I'll show the results for the last call, but you can play with the others.
StartDate c
------------------- ----
2012-05-16 12:51:00 77
2012-05-16 12:56:00 21
2012-05-16 13:01:00 16
2012-05-16 13:06:00 55
2012-05-16 13:11:00 111
Some notes:
- The join using s.dt is likely to perform better than any extraction methods using datepart if the datetime column in your base table has an index (or might in the future).
- I assumed you wanted to show all the intervals within the range. If you don't want to show the intervals with 0 counts, just change the left outer join to an inner join.
- I go down a minute in case the start date was rounded up when converted to SMALLDATETIME. For the 1 minute interval and possibly others, this can lead to 0 counts for the first interval. You can adjust how this rounding happens (e.g. you can use FLOOR() to make sure it always rounds down). All depends on how accurate you need to be.
- I didn't include any WHERE clause but you may need to have those to filter. For example your query may want all the intervals for a given day. You may want to change the calculation of
dates
to produce all the intervals for the day, rather than all the intervals between the min and max time found in dbo.SomeTable
on that day. The following variation takes care of this, by presenting the data for a single day starting from midnight and incrementing by @MinutInterval:
...
CREATE PROCEDURE dbo.GetGroupedIntervalsByDay
@Date DATE,
@MinuteInterval TINYINT = 1
AS
BEGIN
SET NOCOUNT ON;
DECLARE
@IntervalCount INT,
@StartDate SMALLDATETIME = @Date;
SELECT
@IntervalCount = 1440 / @MinuteInterval;
;WITH dates(s,e) AS
(
SELECT
DATEADD(MINUTE, @MinuteInterval*(n.n-1), @StartDate),
DATEADD(MINUTE, @MinuteInterval*(n.n), @StartDate)
FROM
(
SELECT
TOP (@IntervalCount) ROW_NUMBER() OVER (ORDER BY o.[object_id])
FROM sys.all_columns AS o
ORDER BY o.[object_id]
) AS n(n)
)
SELECT StartDate = d.s, c = COUNT(s.dt)
FROM dates AS d
LEFT OUTER JOIN dbo.SomeTable AS s
ON s.dt >= d.s AND s.dt < d.e
-- AND any filter criteria for dbo.SomeTable?
GROUP BY d.s
ORDER BY d.s;
END
GO
Sample calls:
EXEC dbo.GetGroupedIntervalsByDay @Date = '20120516', @MinuteInterval = 1;
EXEC dbo.GetGroupedIntervalsByDay @Date = '20120516', @MinuteInterval = 2;
EXEC dbo.GetGroupedIntervalsByDay @Date = '20120516', @MinuteInterval = 5;
Abridged results from the last call:
StartDate c
------------------- ----
2012-05-16 00:00:00 0
2012-05-16 00:05:00 0
2012-05-16 00:10:00 0
...
2012-05-16 12:40:00 0
2012-05-16 12:45:00 0
2012-05-16 12:50:00 45
2012-05-16 12:55:00 53
2012-05-16 13:00:00 16
2012-05-16 13:05:00 55
2012-05-16 13:10:00 111
2012-05-16 13:15:00 0
2012-05-16 13:20:00 0
...
2012-05-16 23:45:00 0
2012-05-16 23:50:00 0
2012-05-16 23:55:00 0
(Again if you don't want to include the intervals with no counts, change the left outer join to an inner join. You also might find some odd results if you choose an interval that doesn't fit nicely inside of 1440. I'll leave that case as an exercise to the reader.)
Technically to display n-th value you use the ranking functions like RANK
, DENSE_RANK
or ROW_NUMBER
. Which one exactly depends, but what you describe matches best ROW_NUMBER
:
with cte as (
select row_number() over (order by Date) as rn, *
from table)
where rn % 1000 = 0;
But you are saying that you want o do this for performance reasons, avoid going through millions of rows. Such a query had already done the damage, it read from disk all the rows and had to sort them so the performance price was already payed.
A better alternative to selecting a random sample of data is to use the TABLESAMPLE
clause, see Limiting Result Sets by Using TABLESAMPLE:
select *
from table tablesample (100 rows);
TABLESAMPLE will be way more efficient as it actually avoids reading all the data, it only samples some of the pages in the table and returns all rows in the sampled pages.
But consider that if you use TABLESAMPLE your WHERE clause are applied after the sampling. So it may be that the sample does not contain any row for the device/sensor you're interested in. This is specially true for sensor with small data.
Best Answer
Assuming seconds always are '00.000':
fiddle
Added by OP
Thank you very much, this was very helpful. BTW, because I had records with seconds and milliseconds, I had to add cast the time to "smalldatetime" which did the trick very well,