Sql-server – Finding the row with the latest date for each row type

azure-sql-databaseoptimizationsql serversql server 2014

I've got an SQL database (SQLAzure V12 version running on Azure) that looks like the following:

DECLARE @VehicleData TABLE (
    id int PRIMARY KEY,
    Registration nvarchar(50) NOT NULL,
    [DateTime] DateTime NOT NULL,
    [Event] nvarchar(20) NOT NULL,
    [Data] nvarchar(200) NULL,
    INDEX IX1 CLUSTERED(Registration, Datetime),
    INDEX IX2 NONCLUSTERED(Registration, Event, DateTime)
    )

An auxiliary table looks like this:

DECLARE @EventList TABLE (
    id int PRIMARY KEY,
    [EventName] nvarchar(20) NOT NULL
    )

The table is filled with location data taken from tracking units on multiple vehicles. Each row might have a particular [Event] type, e.g. Update, HarshBraking, Delivery etc.

I have a query that needs to get the latest date for each event for a particular vehicle BEFORE A SPECIFIED DATE.

I'm using the following query to do that:

@Registration nvarchar(20),
@DateTime datetime

SELECT V.Registration, V.DateTime, V.Event, V.Data
    FROM VehicleData V
    INNER JOIN (
        SELECT Registration, MAX(DateTime) LatestDate, Event
        FROM VehicleData
        WHERE Registration = @Registration AND DateTime < @DateTime
        AND Event in (SELECT Name FROM EventList) 
        GROUP BY Registration, Event ) AS L
    ON V.Registration = L.Registration AND V.Event = L.Event AND V.DateTime = L.LatestDate

Note the line AND Event in (SELECT Name FROM EventList) was added later as it forced the query to use the IX2 index, which seemed to improve the query speed.

The query takes some time to run (around 10 seconds or so) and I would ideally like to see if I can optimise it further to take the speed down to less than a second (as it is tied to a GUI action).

How can I optimise this query further?

Edit You can download the execution plan from http://pastebin.com/9nVZzYPL. Just paste the contents into a .sqlplan file and then load it into SSMS to view the plan. Query 1 is the only thing of interest here. The other queries are "post processing" operations that don't contribute much to the overall query time.

Best Answer

How about using windowing functions?

WITH Numbered AS
(
SELECT *, ROW_NUMBER() OVER (PARTITON BY Event ORDER BY DateTime DESC) AS rownum
FROM VehicleData
WHERE Registration = @Registration 
AND DateTime < @DateTime
)
SELECT V.Registration, V.DateTime, V.Event, V.Data
FROM Numbered V
WHERE rownum = 1;

Make sure you have an index on: (Registration, Event, DateTime) INCLUDE (Data)

You can use your extra predicate on EventList too if you like, of course.