I've been asked to improve the performance of the following query.
I can't amend any existing Indexes but I can create new ones.
The table structures can be found in this SQL Fiddle
The row sizes of the tables are ( from development – query plans and statistics etc will be provided from development)
dbo.Logs --6620130
dbo.LogMobiles --6620082
For reference, the table sizes in Production are
dbo.Logs --271856036
dbo.LogMobiles --269611865
I updated the statistics on the tables before running the query like so
UPDATE STATISTICS dbo.Logs ;
UPDATE STATISTICS dbo.LogMobiles ;
DBCC UPDATEUSAGE(db1, 'dbo.Logs');
DBCC UPDATEUSAGE(db1, 'dbo.LogMobiles');
The query is as follows
DECLARE
@rpStartDateTime DATETIME = '20180101'
,@rpEndDateTime DATETIME = '20180131' ;
DECLARE
@iStart INT = DATEDIFF(SECOND, '19700101', @rpStartDateTime)
,@iEnd INT = DATEDIFF(SECOND, '19700101', @rpEndDateTime);
SELECT
ROW_NUMBER()OVER(PARTITION BY L.NodeId ORDER BY L.NodeId, L.GPSTime ASC)
, V.NodeId
, V.UnitMaxSpeed
, M.DriverNodeId1
, M.StatusText
, M.Speed
, ISNULL(S.StreetMaxSpeed,200)
FROM
#xMobiles V
INNER JOIN
dbo.Logs L ON L.NodeId = V.NodeId
INNER JOIN
dbo.LogMobiles M ON M.LogId = L.LogId
--Must have Drivers identifiable
INNER JOIN
#xDrivers DN ON DN.DriverNodeId = M.DriverNodeId1
LEFT JOIN
dbo.Streets S ON S.StreetId = M.StreetId
WHERE
L.GPSTime >= @iStart AND L.GPSTime <= @iEnd
ORDER BY L.NodeId, L.GPSTime
OPTION (MAXDOP 1) ;
The LogId
from dbo.LogMobiles
is a FK
that references LogId
(PK) from dbo.Logs
.
Here is the Query Plan as is with no additional indexes
The Logical reads are as follows
From the Plan, the number of Actual Rows from dbo.Logs is 230768 which matches the final output. The number of Actual Rows from dbo.LogMobiles (from a Clustered Index Scan) is 5676246, this is 24x greater.
Time taken to execute ~ 4 seconds.
There should be a one2one relationship between dbo.Logs and dbo.LogMobiles (although other functionality do log rows to dbo.Logs) and as you can see the big difference between the number of logical reads and actual number of rows retrieved from these tables.
I then added the following Index on dbo.LogMobiles
CREATE NONCLUSTERED INDEX IX_LogId_SBR ON dbo.LogMobiles ( DriverNodeId1 ASC, LogId ASC ) INCLUDE ( Speed, StatusText, StreetId )
This is Plan with the new Index
This reduced the number of logical reads on dbo.LogMobiles
But increased the actual number of rows returned on the Index scan to 6620082 ( 28x greater).
Time taken to execute ~ 4 seconds – same.
What should I try next?
Best Answer
I am an Oracle Developer. Based on my understanding of Oracle performance tuning, this is how I would approach your situation.
I am seeing three entry points to your SQL.
Each entry point will produce a different execution plan. Based on the selectivity of rows upfront in the execution plan tree, the best plan needs to be selected dynamically. Go through the data volume that gets generated at each step for the three possible execution plans. That will give you a good insight into the fact that "which plan suits your particular scenario".
Regarding your first execution plan, you have stated that the number of Actual Rows from dbo.Logs is 230768 and the Clustered Index(CI) Scan is showing 5676246 (24x greater).
Please note the difference between "SEEKs" and "SCANs". IX_Log_NodeID1 was accessed via an Index Seek. This uses the predicate values passed for that index and seeks only those rowids matching the predicate values. It does not scan the entire index. Whereas, in step two of the execution plan, the entire CI PK_LogMobiles is scanned into memory and only then a nested loop is performed. This is not optimal.
That is our issue. In step two PK_LogMobiles must be accessed via CI Seek, seeking only the 230768 logIDs out of the entire 5676246 logIDs. Instead the entire CI is scanned.
In Oracle such scenarios arise when the 230768 rows are scattered across too many pages (SQLServer) or blocks (Oracle) causing a major part / the entire CI to be scanned. This means that the records are not clustered enough for seeks. Improve the ClusteringFactor (Oracle Terminology) of your so called CI to resolve your performance issue.