I have a table which stores the sequence (counter) of data received from devices out in the field. At any rate, these sequences need to be in order within a configurable time span, but can come into the system out of order. If a device is reset, then it's sequence number is set back to 0.
CREATE TABLE [dbo].[TapGapDetail]
(
[Id] [bigint] NOT NULL, --FK to another table
[DeviceESN] [varchar] (100) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[TapDateUTC] [datetime] NOT NULL, --date event occurred on device
[CreatedDateUTC] [datetime] NOT NULL,
[Counter] [int] NOT NULL,
)
CREATE CLUSTERED INDEX [CX_TapGapDetail] ON [dbo].[TapGapDetail] ([DeviceESN], [CreatedDateUTC], [Counter]) ON [PRIMARY]
GO
I've tried to switch the order the CX by DeviceESN and CreatedDateUTC, but on my system it doesn't seem to make much of a difference in the IO. There are millions of rows in this table.
Example insert would be:
INSERT INTO TapGapDetail (1, 'A', '1/1/2012 01:00AM', '1/1/2012 01:10AM', 5)
INSERT INTO TapGapDetail (2, 'A', '1/1/2012 12:05AM', '1/1/2012 01:15AM', 4) --out of order by insert date
INSERT INTO TapGapDetail (3, 'A', '1/2/2012 12:00AM', '1/2/2012 12:05AM', 6) --back in order
INSERT INTO TapGapDetail (4, 'A', '1/3/2012 01:00AM', '1/3/2012 01:05AM', 8) --missing 7
INSERT INTO TapGapDetail (5, 'A', '1/3/2012 06:00AM', '1/3/2012 06:05AM', 9) --in order outside 'tolerance'
INSERT INTO TapGapDetail (6, 'A', '1/10/2012 06:00AM', '1/10/2012 06:05AM', 0) --device reset
The report needs to report that a gap has occurred once for this data. There is also a detail, but one step at a time. I want to get it to run in less than 10 seconds, but seems to take at least 1.25 minutes on my local. I have stat IO if needed. So, the proc I created to get the "gaps" is like so:
CREATE PROC [dbo].[GetValidatorTapGapSummary]
@Validator varchar(100) = ''
,@FilterDateUtc datetime
,@ToleranceHours int
AS
--temp table
SELECT
RowID = ROW_NUMBER() OVER (ORDER BY DeviceESN, CreatedDateUTC, [Counter]),
DeviceESN,
TapDateUTC,
CreatedDateUTC,
[Counter]
INTO #taps
FROM TapGapDetail
WHERE CreatedDateUTC >= @FilterDateUtc
Order By 1
--cx
CREATE CLUSTERED INDEX CX1 ON #taps(RowID)
--results
select
t.DeviceESN as Validator
,sum(
case
--They are in sequence
when t2.[Counter] = t.[Counter]+ 1 then 0
--A reset has occured
when t2.[Counter] < t.[Counter] then 0
--A gap exists. Find the difference
else (t2.[Counter] - t.[Counter] - 1)
end)
as TapGaps
,case
--gets the last tap date per validator
when MAX(t.TapDateUTC) > MAX(t2.TapDateUTC) THEN MAX(T.TapDateUTC)
ELSE MAX(t2.TapDateUTC)
END
AS MaxTapDate
,case
--gets the last tap date per validator
when MAX(t.CreatedDateUTC) > MAX(t2.CreatedDateUTC) THEN MAX(T.CreatedDateUTC)
ELSE MAX(t2.CreatedDateUTC)
END
AS MaxCreatedDate
from #Taps t
inner join #Taps t2 on t.DeviceESN = t2.DeviceESN and t2.RowID = t.RowID + 1
where t2.[Counter] != t.[Counter] + 1
and t2.[Counter] > t.[Counter]
And t.CreatedDateUTC >= @FilterDateUtc
And t2.CreatedDateUTC >= @FilterDateUtc
And (t.DeviceESN = @Validator Or @Validator = '')
And Not Exists --edge case for when there is a gap at the end, tried with left join and stats are the same, so this is easier to read I think.
(Select Top 1 Null From TapGapDetail tgd Where tgd.DeviceEsn = t.DeviceESN And (t.Counter + 1) = tgd.Counter
And tgd.CreatedDateUTC >= DateAdd(day, -1 * @ToleranceHours, t.CreatedDateUTC)
And tgd.CreatedDateUTC <= DateAdd(day, @ToleranceHours, t.CreatedDateUTC))
group by t.DeviceESN
Order By MaxCreatedDate Desc, t.DeviceEsn
GO
An example STATIO would be:
Table 'TapGapDetail'. Scan count 5, logical reads 24060, physical reads 400, read-ahead reads 4590, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(2763981 row(s) affected)
(1 row(s) affected)
Table '#taps_______________________________________________________________________________________________________________000000000051'. Scan count 1, logical reads 18246, physical reads 0, read-ahead reads 251, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(154 row(s) affected)
Table '#taps_______________________________________________________________________________________________________________000000000051'. Scan count 10, logical reads 37834, physical reads 0, read-ahead reads 1240, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'TapGapDetail'. Scan count 5, logical reads 13563, physical reads 3, read-ahead reads 13445, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
I've tried various indexes and what not on the temp table, but I suspect that even just filling is taking too long. Any suggestions are appreciated.
Best Answer
Use a different method.
For a start, don't populate a temporary table with 2.7M rows - that's not going to want to return in under ten seconds. You could use a CTE instead, and that might work much better.
I'm not saying this is necessarily terrific (the "NOT EXISTS" bit could be a pain), but it's almost certainly an improvement.