Sql-server – Improve performance of query using IN()

filtered-indexperformancequery-performancesql-server-2008-r2

I have the following SQL query:

SELECT
  Event.ID,
  Event.IATA,
  Device.Name,
  EventType.Description,
  Event.Data1,
  Event.Data2
  Event.PLCTimeStamp,
  Event.EventTypeID
FROM
  Event
INNER JOIN EventType ON EventType.ID = Event.EventTypeID
INNER JOIN Device ON Device.ID = Event.DeviceID
WHERE
  Event.EventTypeID IN (3, 30, 40, 41, 42, 46, 49, 50)
  AND Event.PLCTimeStamp BETWEEN '2011-01-28' AND '2011-01-29'
  AND Event.IATA LIKE '%0005836217%'
ORDER BY Event.ID;

I also have an index on the Event table for the column TimeStamp. My understanding is that this index is not used because of the IN() statement. So my question is is there a way to make an index for this particular IN() statement to speed up this query?

I also tried adding Event.EventTypeID IN (2, 5, 7, 8, 9, 14) as a filter for the index on TimeStamp, but when looking at the execution plan it doesn't appear to be using this index. Any suggestions or insight into this would be greatly appreciated.

Below is the graphical plan:

Execution Plan

And here is a link to the .sqlplan file.

Best Answer

Given tables of the following general form:

CREATE TABLE Device 
(
    ID integer PRIMARY KEY
);

CREATE TABLE EventType
(
    ID integer PRIMARY KEY, 
    Name nvarchar(50) NOT NULL
);

CREATE TABLE [Event]
(
    ID integer PRIMARY KEY, 
    [TimeStamp] datetime NOT NULL, 
    EventTypeID integer NOT NULL REFERENCES EventType, 
    DeviceID integer NOT NULL REFERENCES Device
);

The following index is useful:

CREATE INDEX f1 
ON [Event] ([TimeStamp], EventTypeID) 
INCLUDE (DeviceID)
WHERE EventTypeID IN (2, 5, 7, 8, 9, 14);

For the query:

SELECT
  [Event].ID,
  [Event].[TimeStamp],
  EventType.Name,
  Device.ID
FROM
  [Event]
INNER JOIN EventType ON EventType.ID = [Event].EventTypeID
INNER JOIN Device ON Device.ID = [Event].DeviceID
WHERE
  [Event].[TimeStamp] BETWEEN '2011-01-28' AND '2011-01-29'
  AND Event.EventTypeID IN (2, 5, 7, 8, 9, 14);

The filter meets the AND clause requirement, the first key of the index allows a seek on [TimeStamp] for the filtered EventTypeIDs and including the DeviceID column makes the index covering (because DeviceID is required for the join to the Device table).

Finished plan

The second key of the index - EventTypeID is not strictly required (it could also be an INCLUDEd column); I have included it in the key for the reasons stated here. In general, I advise people to at least INCLUDE columns from a filtered index WHERE clause.


Based on the updated query and execution plan in the question, I agree that the more general index suggested by SSMS is likely the better choice here, unless the list of filtered EventTypeIDs is static as Aaron also mentions in his answer:

CREATE TABLE Device 
(
    ID integer PRIMARY KEY,
    Name nvarchar(50) NOT NULL UNIQUE
);

CREATE TABLE EventType
(
    ID integer PRIMARY KEY, 
    Name nvarchar(20) NOT NULL UNIQUE,
    [Description] nvarchar(100) NOT NULL
);

CREATE TABLE [Event]
(
    ID integer PRIMARY KEY, 
    PLCTimeStamp datetime NOT NULL,
    EventTypeID integer NOT NULL REFERENCES EventType, 
    DeviceID integer NOT NULL REFERENCES Device,
    IATA varchar(50) NOT NULL,
    Data1 integer NULL,
    Data2 integer NULL,
);

Suggested index (declare it unique if that is appropriate):

CREATE UNIQUE INDEX uq1
ON [Event]
    (EventTypeID, PLCTimeStamp)
INCLUDE 
    (DeviceID, IATA, Data1, Data2, ID);

Cardinality information from the execution plan (undocumented syntax, do not use in production systems):

UPDATE STATISTICS dbo.Event WITH ROWCOUNT = 4042700, PAGECOUNT = 400000;
UPDATE STATISTICS dbo.EventType WITH ROWCOUNT = 22, PAGECOUNT = 1;
UPDATE STATISTICS dbo.Device WITH ROWCOUNT = 2806, PAGECOUNT = 28;

Updated query (repeating the IN list for the EventType table helps the optimizer in this specific case):

SELECT
  Event.ID,
  Event.IATA,
  Device.Name,
  EventType.Description,
  Event.Data1,
  Event.Data2,
  Event.PLCTimeStamp,
  Event.EventTypeID
FROM
  Event
INNER JOIN EventType ON EventType.ID = Event.EventTypeID
INNER JOIN Device ON Device.ID = Event.DeviceID
WHERE
  Event.EventTypeID IN (3, 30, 40, 41, 42, 46, 49, 50)
  AND EventType.ID IN (3, 30, 40, 41, 42, 46, 49, 50)
  AND Event.PLCTimeStamp BETWEEN '2011-01-28' AND '2011-01-29'
  AND Event.IATA LIKE '%0005836217%'
ORDER BY Event.ID;

Estimated execution plan:

Second plan

The plan you get will likely be different because I am using guessed statistics. The general point is to give the optimizer as much information as you can, and provide an efficient access method (index) on the 4-million row [Event] table.