Sql-server – Non-Unique Clustered Indexes Administration

fragmentationindexperformancesql serversql-server-2016

I have been recently assigned to administer a database in SQL Server 2016 and I have discovered that there are many tables in the database with Non-Unique Clustered indexes resulting in Non-Clustered Primary Key indexes.

I know that the above is apparently allowed but is not ideal (specially for large tables) and maybe for some few cases, that would make sense, but the number of such tables tells me that this was more like a result of careless table definitions.

For example, a table named Transaction will have a Clustered Index with the following keys: date, is_deleted. Also, the table will have a Non-Clustered Primary key on a column named id.

Now, some of these tables are pretty large and have many foreign key references on them, so I came up with the below steps to change all these tables with the help of Aaron's answer in this thread Unable to drop non-PK index because it is referenced in a foreign key constraint:

  1. Drop the Clustered Index of the table
  2. Drop all foreign key constrains that reference the table
  3. Drop current Primary Key constraint of the table
  4. Create Primary Key Clustered constraint using the id column
  5. Create all foreign key constraints that were previously deleted
  6. Optionally – Create the previously Clustered index as a Non-Clustered index

I know that for the large tables I need to find a maintenance window to perform the change but do you find any problems or gotchas with the above solution what I didn't think of? Will this method cause fragmentation on the rest of Non-Clustered Indexes?

There are a lot of tables involved and I want to make sure that I'm not triggering any side effects.

Best Answer

This one is hard to answer without knowing your query patterns, but you will most definitely cause side effects when dropping all clustered indexes and replacing them with something else.

I also don't think you are necessarily correct in assuming that having a non-unique clustered index is sloppy table design or by definition sub-optimal.

When you add a non-unique clustered index, SQL Server will add a 4-byte uniqueifier to the index to make sure the key is unique if there are duplicate values in the key. Calculating this uniqueifier adds a bit of overhead but nothing too dramatic. This uniqueifier also has to be in the non-clustered index since the index bookmark has to point to a single row.

On the other hand, the example you give could be a good clustered index depending on the situation:

For example, a table named Transaction will have a Clustered Index with the following keys: date, is_deleted. Also, the table will have a Non-Clustered Primary key on id column.

This could be a very good design decision if there are a lot of transactions on the same date, and almost every query is a SELECT * FROM transactiontable WHERE date = <somedate> AND is_deleted=0.

A query like that could hurt badly if it has to use a non-clustered index and resort to key lookups.

It's impossible to say what side effects you will trigger, but you'll surely trigger some, and your reasons for this endeavour might be the wrong ones.

If you insist on doing this, make sure you test it first in a testing environment, and if you do you should be able to tell if this method works and if it will cause fragmentation or not (but that will probably be the least of your worries).