I have a table like below: I want to get time duration for each record which SPEED is greater than 100. I written a function for that, it’s working fine , but my problem was i have millions of records these table so while i am calling these function from a stored procedure it is taking too much time.
Can you please suggest any other option for this..
Here is my table structure and data
---------------------------------------
ID SPEED TRACKTIME
---------------------------------------
1 50 2014-10-01 00:20:00
2 80 2014-10-01 00:21:00
3 110 2014-10-01 00:21:30
4 105 2014-10-01 00:22:10
5 97 2014-10-01 00:23:00
6 65 2014-10-01 00:25:00
7 107 2014-10-01 00:26:00
8 101 2014-10-01 00:27:00
9 41 2014-10-01 00:29:00
10 70 2014-10-01 00:30:00
-------------------------------------
My function:
CREATE FUNCTION [dbo].[udf_GetOverspeedMinute]
(
@vehicleID INT,
@starttime DATETIME,
@endtime DATETIME,
@speedlimit DECIMAL(18, 2)
)
RETURNS @tmp_overspeed TABLE(ID INT IDENTITY(1, 1), TimeInMinute INT)
AS
BEGIN
DECLARE @tmp_tracking TABLE (
ID INT IDENTITY(1, 1) PRIMARY KEY,
TrackingID INT,
Speed DECIMAL(18, 2),
TrackTime DATETIME
)
INSERT INTO @tmp_tracking
(
TrackingID,
Speed,
TrackTime
)
SELECT Tracking.ID,
Speed,
TrackTime
FROM Tracking WITH(NOLOCK)
WHERE VehicleID = @vehicleID
AND TrackTime BETWEEN @starttime AND @endtime
;WITH cte_speed AS (
SELECT RN = ROW_NUMBER() OVER(ORDER BY TrackTime),
*
FROM @tmp_tracking
)
INSERT INTO @tmp_overspeed
(
TimeInMinute
)
SELECT SUM(
DATEDIFF(MINUTE, [Current Row].TrackTime, [Next Row].TrackTime)
)
FROM cte_speed [Current Row]
LEFT JOIN cte_speed [Next Row]
ON [Next Row].RN = [Current Row].RN + 1
WHERE [Current Row].Speed > @speedlimit
RETURN
END
Best Answer
You don't say what DB you are using, but that syntax looks like MS SQL Server.
Also you do not properly state the table structure - the function refers to VehicleID which doesn't appear in you sample data and you don't say what the data types or what indexes and keys are present. All of this could be significant.
You are pulling a bunch if data into a table variable and processing it there which could be a problem - if there are many rows per vehicle this will hit disk and cause slowness due to that IO.
With SQL Server 2012 or later this is quite easy to do using window functions:
For this to be efficient you'll at very least need an index covering VehicleID. Better still either one covering VehicleID & TrackTime (perhaps including speed & ID), or have VehicleID & TrackTime be your clustered index (though for a definite recommendation we need to know a lot more about the other queries the table sees: picking your clustered index to optimise this query might show others).
I'd your indexes are right this should pass over the data once, with and index seek rather than a scan (or worse, a table scan) and no spooling to disk unless the data per vehicle is really massive.
If you want results for multiple vehicles then you need to tell the lead function to split by VehicleID so it becomes
LEAD(TimeTrack) OVER (PARTITION BY VehicleID ORDER BY TimeTrack)
, and add VehicleID to the main ordering clause too.