Sql-server – Clustered vs Non Clustered Indexes for a Primary Key

clustered-indexnonclustered-indexprimary-keysql serveruniqueidentifier

I have a large table with around 19 million records in it.

The primary key is a uniqueidentifier column called ID and there is a nonclustered index on this column. An application generates these GUID values.

The table currently has no clustered index on it.

I am considering changing from a nonclustered index to a clustered index, since the table currently has no clustered index. I am wondering what are the pros and cons of doing this or if I should?

  1. Leave things as is
  2. Find another column to create the clustered index on
  3. Change this ID column over from a non clustered index to a clustered index.

We are getting about 50K inserts daily, with no deletes and no updates. Let me know if you need any additional details.

The table has 24 columns total and 5 nonclustered indexes without a clustered index:

  • Index 1- PK on ID (uniqueidentifier)
  • Index 2- TimeStamp (datetime)
  • Index 3- Hash Column (nvarchar(64))
  • Index 4- Covering index 4 columns
  • Index 5- Covering index 6 columns

I inherited the database from someone else. The application uses this GUID as a primary key with relationships to matching foreign keys in respective child tables so the queries ​use the uniqueidentifier id's in joins.

How do I determine if my GUID is generated sequentially? They don't appear similar. Some examples are as follows ordered by createddate:

24A689A0-831E-4670-9204-766256C58E43
ED001C69-11BA-430D-ACF9-3C3CADF3B400
768E5497-4F48-4C65-87D9-E9C27066708E

Doing a little research, it seems like a uniqueidentifier column is not a good candidate for a clustered index, so I'm wondering if I need to:

  1. Look to a few columns to create a composite clustered index on these columns or
  2. Live without a clustered index
  3. Something else?

Let me know if I can provide any additional details. If anyone can give me some guidance as to what are some considerations for making this decision, I would appreciate that greatly.

Best Answer

Unless you switch to sequential GUID generation, you don't want the GUID as a clustered index key. So your choices are

1) Make the ID column the clustered index key. This will optimize joins and lookups on ID, but it sounds like those are rare.

2) For SQL 2016+ replace the Heap with a Clustered Columnstore index. This compresses the data and optimizes table scans, perhaps to the extent that you can drop some of your other indexes.

3) Pick your most used non-clustered index and make that the Clustered index.

4) Leave it as a Heap.

In any case, the "child tables" that use the GUID as a foreign key might benefit from having that foreign key column as the leading column in their clustered primary key.