I am still new to Query optimisation, and I have a Stored Procedure which uses a Cursor to go through each row within the Table, and performs the following operations:
- Calculate the Time Difference between each Row
- Calculate Distance Between each Row
- If Distance < 5 AND TimeDifference > 3 minutes THEN ADD to TEMP TABLE
I tried converting this Cursor to a WHILE Loop but performance decreased.
So I need help converting this into a SET BASED
approach instead of a Procedural Based
Approach
So the Cursor performs this Logic:
-- READ Current Row into Cursor Variables
FETCH NEXT FROM crAssetIgnitionOnOff INTO
@current_iVehicleMonitoringID
, @current_iAssetID
, @current_dtUTCDateTime
, @current_sptGeoLocationPoint
, @current_fLatitude
, @current_fLongitude
, @current_fAngle
, @current_fSpeedKPH
, @current_sIgnitionStatus
, @current_eEventCode
, @current_sEventCode
IF(@current_iAssetID = @prev_iAssetID)
BEGIN
---- Calculate Time Difference from previous Point
DECLARE @diffInSeconds INT
SET @diffInSeconds = DATEDIFF(SECOND, @prev_dtUTCDateTime, @current_dtUTCDateTime)
DECLARE @diffInMinutes INT
SET @diffInMinutes = @diffInSeconds / 60
-- Calcualte the Distance from previous position
DECLARE @tempDistance FLOAT;
SELECT @tempDistance = @current_sptGeoLocaitonPoint.STDistance(@prev_sptGeoLocaitonPoint);
-- Check if distance travelled less than 5, AND Time difference between points greater than user selected Idle Minutes (@iIdleMinutes) AND prev ignition status = On
IF(@diffInSeconds > @iIdleMinutes AND @tempDistance < 5 AND @prev_sIgnitionStatus = 'On')
BEGIN
DECLARE @sTime VARCHAR(30)
SELECT @sTime = dbo.xPT_ConvertTimeToDDHHMMSS(@diffInSeconds,'s')
INSERT INTO @tblExcessiveIdleTime(
AssetID,
PreviousDate,
CurrentDate,
TimeString,
TimeInSeconds
)
VALUES
(
@current_iAssetId,
@prev_dtUTCDateTime,
@current_dtUTCDateTime,
@sTime,
@diffInSeconds
)
END
END
-- Set Previous Values End of Loop
SET @prev_iVehicleMonitoringID = @current_iVehicleMonitoringID
SET @prev_iAssetID = @current_iAssetID
SET @prev_dtUTCDateTime = @current_dtUTCDateTime
SET @prev_sptGeoLocationPoint = @current_sptGeoLocationPoint
SET @prev_fLatitude = @current_fLatitude
SET @prev_fLongitude = @current_fLongitude
SET @prev_fAngle = @current_fAngle
SET @prev_fSpeedKPH = @current_fSpeedKPH
SET @prev_sIgnitionStatus = @current_sIgnitionStatus
SET @prev_eEventCode = @current_eEventCode
SET @prev_sEventCode = @current_sEventCode
END
Now this takes 17 minutes to Execute in some cases, so I tried converting it to a WHILE Loop – (http://www.sqlbook.com/SQL/Avoiding-using-SQL-Cursors-20.aspx)
Which was not a good idea – as it's performance with Number of logical reads was 4 times that of the Cursor. And it took longer to process:
WHILE @RowCount <= @NumberRecords
BEGIN
-- Check for First Row
IF @RowCount = 1
BEGIN
-- Set First Row as Previous
SELECT @previous_iAssetID = iAssetID, @previous_sptGeoLocaitonPoint = sptGeoLocaitonPoint, @previous_dtUTCDateTime = dtUTCDateTime, @previous_sIgnitionStatus = sIgnitionStatus
FROM #tblVehicleMonitoringLog WHERE RowID = @RowCount
END
ELSE
BEGIN
/* Select current Row */
SELECT @current_iAssetID = iAssetID, @current_sptGeoLocaitonPoint = sptGeoLocaitonPoint, @current_dtUTCDateTime = dtUTCDateTime, @current_sIgnitionStatus = sIgnitionStatus
FROM #tblVehicleMonitoringLog WHERE RowID = @RowCount
/******** IMPLEMENT REPORT LOGIC **********/
IF(@current_iAssetID = @previous_iAssetID)
BEGIN
---- Calculate Time Difference from previous Point
DECLARE @diffInSeconds INT
SET @diffInSeconds = DATEDIFF(SECOND, @previous_dtUTCDateTime, @current_dtUTCDateTime)
DECLARE @diffInMinutes INT
SET @diffInMinutes = @diffInSeconds / 60
-- Calcualte the Distance from previous position
DECLARE @tempDistance FLOAT;
SELECT @tempDistance = @current_sptGeoLocaitonPoint.STDistance(@previous_sptGeoLocaitonPoint);
-- Check if distance travelled less than 5, AND Time difference between points greater than user selected Idle Minutes (@iIdleMinutes) AND prev ignition status = On
IF(@diffInSeconds > @iIdleMinutes AND @tempDistance < 5 AND @previous_sIgnitionStatus = 'On')
BEGIN
DECLARE @sTime VARCHAR(30)
SELECT @sTime = dbo.xPT_ConvertTimeToDDHHMMSS(@diffInSeconds,'s')
INSERT INTO @tblExcessiveIdleTime(
iAssetID,
dtIgnitionOn,
dtNextPeriodic,
sTime,
iTimeDurationInSeconds
)
VALUES
(
@current_iAssetId,
DATEADD(hour, @fGmtOffSet, @previous_dtUTCDateTime),
DATEADD(hour, @fGmtOffSet, @current_dtUTCDateTime),
@sTime,
@diffInSeconds
)
END
END
-- Set Previous Values End of Loop
SET @previous_iAssetID = @current_iAssetID;
SET @previous_sptGeoLocaitonPoint = @current_sptGeoLocaitonPoint;
SET @previous_dtUTCDateTime = @current_dtUTCDateTime;
SET @previous_sIgnitionStatus = @current_sIgnitionStatus;
END
-- increment Row Number
SET @RowCount = @RowCount + 1
END -- END OF WHILE LOOP
So after looking online AGAIN – I found we can calculate time difference between two Rows. (https://stackoverflow.com/questions/2357515/calculate-time-difference-between-two-rows)
This is what the RAW Data looks like.
I need to calculate the Time Difference and the Distance Between Each Row WHERE
- Current iAssetID = Previous iAssetID
- Previous sDigitalInputValue = '10000000'
This is the Query I came up with:
WITH rows AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY dtUTCDateTime) AS rn
FROM VehicleMonitoringLog
Where dtUTCDateTime > GetDate() - 1
--Order by iAssetId, dtUTCDateTime
)
SELECT mc.iVehicleMonitoringId as CurrentID, mp.iVehicleMonitoringId as PreviousID,
mc.iAssetId as CurrentAsset, mp.iAssetId As PreviousAsset, mc.dtUTCDateTime as CurrentTime, mp.dtUTCDateTime as PreviousTime,
DATEDIFF(second, mc.dtUTCDateTime, mp.dtUTCDateTime) AS DateDiffSeconds
FROM rows mc
JOIN rows mp
ON mc.rn = mp.rn - 1
EDIT
My Query which is working now – Please let me know if you see any performance issues with this:
SELECT dt.CurrentAsset,
dt.Distance,
dt.DateDiffSeconds,
dt.CurrentIgnition,
dt.PreviousIgnition,
ta.sReference,
ta.sCategoryName,
ta.sSiteName,
dbo.xPT_ConvertTimeToDDHHMMSS(DateDiffSeconds,'s')
FROM (
SELECT iVehicleMonitoringId AS CurrentID,
LEAD(iVehicleMonitoringId, 1) OVER (PARTITION BY iAssetID ORDER BY dtUTCDateTime) AS PreviousID,
iAssetId AS CurrentAsset,
LEAD(iAssetId, 1) OVER (PARTITION BY iAssetID ORDER BY dtUTCDateTime) AS PreviousAsset,
sDigitalInputValue AS CurrentIgnition,
LEAD(sDigitalInputValue, 1) OVER (PARTITION BY iAssetID ORDER BY dtUTCDateTime) AS PreviousIgnition,
dtUTCDateTime AS CurrentTime,
LEAD(dtUTCDateTime, 1) OVER (PARTITION BY iAssetID ORDER BY dtUTCDateTime) AS PreviousTime,
DATEDIFF(second, dtUTCDateTime, LEAD(dtUTCDateTime, 1) OVER (PARTITION BY iAssetID ORDER BY dtUTCDateTime)) AS DateDiffSeconds,
sptGeoLocaitonPoint.STDistance(LEAD(sptGeoLocaitonPoint, 1) OVER (PARTITION BY iAssetID ORDER BY dtUTCDateTime)) AS Distance
FROM VehicleMonitoringLog
WHERE dtUTCDateTime > @utcStartDate AND dtUTCDateTime < @utcEndDate
) AS dt
Inner join #tblAssets ta on ta.iAssetID = dt.CurrentAsset
WHERE CurrentIgnition = '10000000' AND Distance < 5 AND DateDiffSeconds > @iIdleMinutes
Best Answer
Your CTE-based approach with window functions is a very good start. There's another, even more suitable window function that you could use: LAG().
Here's how:
Basically,
LAG(column, n) OVER (ORDER BY x)
returns the value ofcolumn
,n
row(s) back (so n=1 returns the previous row), ordered byx
.Your CTE solution will scan
VehicleMonitoringLog
twice, then join the two streams. This query will perform only a single scan, which is much, much more efficient. TheLAG()
function (and its cousinLEAD()
) is available as of SQL Server 2012.Partitioning
It looks like you've forgotten some type of partitioning term (iAssetID, pehaps?). The partitioning term serves to separate data points between different vehicles, in case two vehicles are out-and-about at the same time. Add this partition term to the query by changing the OVER() clause from
OVER (ORDER BY dtUTCDateTime)
toOVER (PARTITION BY iAssetID ORDER BY dtUTCDateTime)
.Indexing
To make this solution really fly, I would create the following index on
VehicleMonitoringLog
:.. or