I wanted to see the effects of having indexes on calculated columns, so I created a table like so:
CREATE TABLE [Domain\UserName].[CompColIndexing](
[a] [int] NOT NULL,
[nonIndexedNonPersisted] AS ([a]+(1)),
[nonIndexedPersisted] AS ([a]+(1)) PERSISTED,
[IndexedNonPersisted] AS ([a]+(1)),
[IndexedPersisted] AS ([a]+(1)) PERSISTED
) ON [DATA]
I've added 800,000
rows to this, with the value for a
cycling through 0
to 9
.
The following indexes were added:
CREATE NONCLUSTERED INDEX [IX_DJB_CompNonPersisted] ON [Domain\UserName].[CompColIndexing]
(
[IndexedNonPersisted] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [DATA]
GO
CREATE NONCLUSTERED INDEX [IX_DJB_CompPersisted] ON [Domain\UserName].[CompColIndexing]
(
[IndexedPersisted] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [DATA]
GO
Than I ran some ORDER BY
clauses to see what performance differences I would get, plannign afterwards to see how changing values of a
would affect things.
SELECT *
FROM [EMEA\BanksD].[CompColIndexing]
ORDER BY a
SELECT *
FROM [EMEA\BanksD].[CompColIndexing]
ORDER BY nonIndexedNonPersisted
SELECT *
FROM [EMEA\BanksD].[CompColIndexing]
ORDER BY IndexedNonPersisted
SELECT *
FROM [EMEA\BanksD].[CompColIndexing]
ORDER BY nonIndexedPersisted
SELECT *
FROM [EMEA\BanksD].[CompColIndexing]
ORDER BY IndexedPersisted
Unexpectedly though, I find that I get exactly the same result for each of the queries:
I was at least expecting the SORT
operation on the first query to be slower as that one is unindexed.
What's happening here?
The cardinality is low on purpose, in reality, I actually need to sort on three different values.
I'm using Microsoft SQL Server 2008 R2 (SP2) - 10.50.4042.0 (X64)
Actual execution plans available at: https://www.brentozar.com/pastetheplan/?id=S10MTuxGg
Best Answer
The table definition that you have is leading to some really odd optimizer behavior. I suspect that you're running into the issue documented in this SE post. To avoid that issue I'm going to create the table with just the [a] and the [IndexedPersisted] columns.
Query hints can be useful to figure out why the optimizer didn't pick the plan that you expected. Here you expected the index to be used but SQL Server did not use it. Let's view both query plans side by side:
The query optimizer thinks that the sort after the table scan is cheaper than 800000 RID lookups. Maybe it's wrong, so let's run the queries and compare performance metrics for them.
I got those numbers by looking at sys.dm_exec_sessions after running the queries in separate sessions with "discard results after execution" checked so I wouldn't have to wait on the rows to be returned to the client.
Those numbers seem reasonable to me. Just because an index can be used does not mean that it should be used, especially if SQL Server will need to read the entire table by using the index. That's the worst use case for an index that I can think of. Indexes can be very useful when selecting a small percentage of rows from the table or when they are covering indexes.
The index is a covering one if I only select the [IndexedPersisted] column. In that case, SQL server thinks that using the index is cheaper than doing a table scan. Code to compare the two methods:
Here are the performance numbers:
Now that the index is a covering index it is a better access path for the table than a table scan.