I'm trying to improve the performance of a specific query that is automatically generated by Entity Framework. I have run the query through SSMS, and it has suggested creating two missing indexes. The table in question:
CREATE TABLE [dbo].[PackageEvents]
(
[EventID] [int] NOT NULL IDENTITY(1, 1),
[PackageID] [char] (24) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[EventDescription] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[EventDate] [datetime] NOT NULL,
[UserName] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[Notes] [varchar] (max) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[IsSynchronized] [bit] NOT NULL CONSTRAINT [DF_AmazonPackageEvents_IsSynchronized]
DEFAULT ((0)),
[LastSyncDate] [datetime] NULL,
[Version] [timestamp] NOT NULL
)
SSMS has suggested the following two indexes:
CREATE NONCLUSTERED INDEX [IX_IsSynchronized] ON [dbo].[PackageEvents]
([IsSynchronized]) INCLUDE ([PackageID])
CREATE NONCLUSTERED INDEX [IX_Covering] ON [dbo].[PackageEvents] ([PackageID])
INCLUDE ([EventDate], [EventDescription], [EventID], [IsSynchronized], [LastSyncDate],
[Notes], [UserName], [Version])
I'm not posting the query I'm optimizing for because it is horrendous (generated by Entity Framework) and is almost unreadable. In general the query is looking for any rows where IsSynchronized = 0 and returning those rows.
Is there a way to combine these two indexes into one index that would provide the same or better performance benefit? Is this question impossible to answer without the exact query?
EDIT: The only existing index is the clustered index on the primary key, EventID.
I searched through the generated Entity Framework query. The following where clause occurs 4 times, but always in the same form:
WHERE EXISTS (SELECT
1 AS [C1]
FROM [dbo].[AmazonPackageEvents] AS [Extent46]
WHERE ([Project30].[PackageID] = [Extent46].[PackageID]) AND
(0 = [Extent46].[IsSynchronized])
)
Which seems to boil down to
WHERE PackageID=@PackageID AND IsSynchronized=0
Best Answer
Well, you could consider a filtered index - if you're always looking for rows where
IsSynchronized = 0
and this number should be relatively small, then instead of those two indexes, consider this instead:Of course you may want to make that even smaller and test to see the difference in impact if the query has to look up the data (should be pretty efficient if the number of rows is small), so - assuming
PackageID
is the clustering key:The overhead of maintaining this index may very well be worth the space savings compared to a full index, especially if it's only being used to optimize this query (or query pattern, at least).
Filtered indexes are not magic, though; JNK brought up some limitations below:
Also you'll want to read these posts:
If you don't want to use a filtered index, you can probably test variations of these:
(For a long time I thought that including BIT columns in the key was wasteful but Martin Smith demonstrated a case where it worked quite well - worth a try. I can't find the post now.)
Without your full schema, data, query patterns etc. we can only guide you and have you test our suggestions in your environment. We can't say, "Ding! This is the one that will work for you!"