SQL Server Indexing – Reasons to Have Both Index with Included Columns and One Without

indexnonclustered-indexsql server

I am looking through a few slow running queries on a database that has under gone significant development (under a number of different developers) over the last couple years.
One of the most commonly used tables has a couple indexes with the same indexed columns, but one will have included columns and the other will not.

For Example:
One Index:

 CREATE NONCLUSTERED INDEX [IDX_tblTable_IndexedColumnId_Inc] ON [dbo].[tblTable]
 (
[IndexedColumnId] ASC
 )

And then the other:

CREATE NONCLUSTERED INDEX [IDX_tblTable_IndexedColumnId_Inc] ON [dbo].[tblTable]
(
[IndexedColumnId] ASC
)
INCLUDE (   [Field1Id],
[Field2],
[Field3],
[Field4],
[Field5],
[Field6]) 

I would assume this is just an oversight and is just using up disk space and additional overhead to have the index without the included columns (the included columns are known to be needed).

Is there any benefit of having two indexes on the same column, one with included columns and one without?

Best Answer

To expand a little on what everyone's favorite n-dimensional analogue commented, there may be some benefit to having both.

Here's a small (but not inadequate) demo.

The necessary:

USE StackOverflow;
SET NOCOUNT ON; 

CREATE TABLE dbo.IndexCrap
(
    Id INT IDENTITY PRIMARY KEY CLUSTERED,
    Whatever DATETIME,
    Nonsense VARCHAR(50),
    Etc UNIQUEIDENTIFIER,
    SoForth BIT
);

INSERT dbo.IndexCrap WITH (TABLOCK)
        (Whatever, Nonsense, Etc, SoForth )
SELECT TOP 1000 DATEADD(DAY, x.n, GETDATE()),
                REPLICATE('A', x.n % 50),
                NEWID(),
                CASE WHEN x.n % 15 = 0 THEN 1 ELSE 0 END
FROM (SELECT ROW_NUMBER() OVER (ORDER BY @@ROWCOUNT) AS n
        FROM sys.messages AS m ) AS x;

The indexes:

CREATE INDEX ix_tinydancer ON dbo.IndexCrap (Whatever);

CREATE INDEX ix_largemarge ON dbo.IndexCrap (Whatever) INCLUDE (Nonsense, Etc, SoForth);

So like your question, on index on one column, one index on the same column with some includes.

The queries:

SELECT COUNT(*)
FROM dbo.IndexCrap AS ic;

SELECT ic.Nonsense, ic.Etc, ic.SoForth
FROM dbo.IndexCrap AS ic
WHERE ic.Whatever >= DATEADD(DAY, 500, GETDATE());

The execution plans:

For the COUNT(*) query, the optimizer chooses our smallest single column index over the wider nonclustered index and the slightly wider clustered index, since it's not filtered, and all we need is a row count.

NUTS

For the wider query, the optimzer chooses the covering index so it doesn't have to a) scan the clustered index or b) access the narrower index and do Key Lookups back to the clustered index.

NUTS

Of course, you may not know every single query that runs on your system. That's why the company I work for writes sp_BlitzIndex, a free tool to analyze your indexes.

If we run EXEC master.dbo.sp_BlitzIndex @DatabaseName = 'StackOverflow', @Mode = 4;, here are some results we're interested in:

NUTS

The two indexes I added to IndexCrap (not counting the PK/CX) get flagged as having duplicate key columns. It also gives their definitions, usage, size, and a whole lot more not shown in the screencap.

The usage counters aren't perfect. They'll reset if you add or drop indexes, in some versions of SQL Server they'll reset if you rebuild indexes, and of course they reset if you restart SQL Server.

Hope this helps!