Does it ever make sense to have a clustered and non clustered index for the same column

indexindex-tuning

I inherited a database and a web service that goes with it.

Looking at the indexes for duplicates I see an two indexes that point at the same column (1 column and sorted the same way). One is clustered and one is not, that is the only difference I see.

Does it ever make sense to keep both of these?
If I delete one which one should I delete

Best Answer

Before addressing the two questions, let me briefly define the difference between clustered & non-clustered indexes.

The clustered index is organized by the key columns. It also includes every other column as part of the row structure (ie, it has the entire row).

The nonclustered index is also organized by the key columns. It implicitly includes the clustering key columns (if the table is clustered), or a pointer to the row (if the table's a heap). If any INCLUDE columns are explicitly specified, they will also be included in the index structure.

Does it ever make sense to keep both of these?

There is a corner case where it makes sense to have a non-clustered index "duplicating" the clustered index. If you have a query that frequently scans the table, and ONLY makes use of the clustering key column, the query optimizer will prefer to use the non-clustered index. The non-clustered index does not contain the full row data, and thus it will take up less physical space. Because it takes up less physical space, SQL Server can scan the table with fewer IOs, and will make use of it for performance reasons.

You can query sys.dm_db_index_usage_stats, you can see the number of user_seeks and user_scans on the two indexes. This will help you see how the two indexes are being used, and determine the usefulness of the two indexes.

If I delete one which one should I delete?

As a general rule, most tables should have a clustered index. (That's a whole other topic). Not knowing any details about your index usage or data access, I'd guess that if you were to drop one, you would drop the nonclustered index (but, that really is a guess based on what I know).

Depending on the clustering key, the data, the data access patterns, etc., the most correct answer might be to drop the clustered index and create a different clustered index. You may want to read more on Effective Clustered Indexes.

Related Solutions

Sql-server – Question about non-clustered index storage in SQL Server

Nonclustered indexes always include a row locator.

For a heap this will be an 8 byte RID (File:Page:Slot). On a table with a clustered index it will be the clustered index key column(s). And it will always be the copied values not a pointer to the values. This duplication of CI key values into all non clustered indexes is why it is often recommended that the CI key be narrow and not frequently updated.

In the table shown in the question the Clustered index key is a 4 byte integer and potentially may also include a 4 byte uniqueifier for any duplicate key values.

In your case as the NCIs are not declared as unique the CI key will be appended to the NCI key.

For unique non clustered indexes the CI key would be added as included column(s) in the leaf pages unless explicitly made part of the key.

See Kalen Delaney: More About Nonclustered Index Keys for some additional information about how you can see this for yourself.

With these 4 rows of data all three indexes only consume a single 8KB data page.

SELECT index_id,
       index_level,
       page_count,
       record_count
FROM   sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('people'), NULL, NULL, 'DETAILED')

Returns

+----------+-------------+------------+--------------+
| index_id | index_level | page_count | record_count |
+----------+-------------+------------+--------------+
|        1 |           0 |          1 |            4 |
|        2 |           0 |          1 |            4 |
|        3 |           0 |          1 |            4 |
+----------+-------------+------------+--------------+

The additional page shown in use by sys.allocation_units.total_pages is an IAM page. This is not used for storing data but just for tracking the pages and extents comprising the index.

Sql-server – Duplicate non-clustered index on Primary Key, can I drop one of them

I think it's safe to remove the non-primary key as it's an exact duplicate of the primary key.

If one of them is used more and takes less space, did you check the fragmentation and last statistics date on both? Maybe one of them is more fragmented or has better stats.

Best Answer

Related Solutions

Sql-server – Question about non-clustered index storage in SQL Server

Sql-server – Duplicate non-clustered index on Primary Key, can I drop one of them

Related Question