Sql-server – SQL Server Primary key / clustered index design decision

database-design, sql-server

Looking for some advice regarding a table / index design decision I've got to make on some tables that I've got to port into SQL Server from an existing 4GL-based database.

I've got a product history table that is inserted into frequently (never updated), and the table has this kind of structure:

ProductNo String(20)
CreatedDateTime DateTime
Description String(100)

At the moment the primary key is made up of a combination of ProductNo and CreatedDateTime in an attempt to define a unique index key. We can have many records per ProductNo.

I'll be creating some 1 to 1 related tables and don't want to carry both the ProductNo and the CreatedDateTime fields into the related tables to act as foreign keys. I also think this combination is a little too fragile to guarantee uniqueness.

So, I'm planning to add a new field to the table 'ProductHistoryPK' as an incrementing Int or SequentialGuid to act as the primary key and a foreign key to related tables.

In terms of indexes I'm thinking of creating

Non-clustered primary key on the new ProductHistoryPK field.
Clustered index on the ProductNo field, as this is the field that is often searched on.

Any thoughts or pointers regarding this?

Thanks…

Best Answer

You are correct to separate "clustered index" from "primary key":

A clustered index is the organisation of data on disk, and it works best if the key is
- narrow
- numeric
- increasing (strictly monotonic)
The primary key identifies a row

Note: GUIDs make poor clustering keys

In this case, with the surrogate column, the table has 2 candidate keys:

ProductHistoryID
ProductNo + CreatedDateTime

Assumed convention states that the ProductHistoryID becomes the PK, but you can leave the PK on (ProductNo, CreatedDateTime): it will just be non-clustered. Which leads to indexes:

clustered index should be on ProductHistoryID
unique non-clustered index on (ProductNo, CreatedDateTime)

Example

CREATE TABLE Product (
    ProductHistoryID int IDENTITY (1,1) NOT NULL,
    ProductNo ...
    CreatedDateTime ...

then you have a choice of either pair of constraints:

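    -- Option 1: surrogate key is the clustered PK, natural key is a unique constraint
    CONSTRAINT PK_Product PRIMARY KEY CLUSTERED (ProductHistoryID),
    CONSTRAINT UQ_Product UNIQUE NONCLUSTERED (ProductNo, CreatedDateTime)

    -- Option 2: natural key is the non-clustered PK, surrogate key is the clustered unique constraint
    CONSTRAINT PK_Product PRIMARY KEY NONCLUSTERED (ProductNo, CreatedDateTime),
    CONSTRAINT UQ_Product UNIQUE CLUSTERED (ProductHistoryID)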
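To make the first option concrete, here is a minimal sketch of the full table together with one 1-to-1 related table. The table names ProductHistory and ProductHistoryNote, the Note column, and the varchar/datetime2 mappings for the question's String and DateTime types are assumptions for illustration, not part of the original design.

```sql
-- Sketch only: table names, the Note column and the type mappings are assumptions.
CREATE TABLE dbo.ProductHistory (
    ProductHistoryID int IDENTITY (1,1) NOT NULL,
    ProductNo        varchar(20)  NOT NULL,
    CreatedDateTime  datetime2(3) NOT NULL,
    Description      varchar(100) NULL,
    -- Narrow, numeric, ever-increasing surrogate key as the clustered PK
    CONSTRAINT PK_ProductHistory PRIMARY KEY CLUSTERED (ProductHistoryID),
    -- The natural key stays enforced and supports searches by ProductNo
    CONSTRAINT UQ_ProductHistory UNIQUE NONCLUSTERED (ProductNo, CreatedDateTime)
);

-- Hypothetical 1-to-1 related table: its PK doubles as the FK to the parent,
-- so only the single int column is carried across.
CREATE TABLE dbo.ProductHistoryNote (
    ProductHistoryID int NOT NULL,
    Note             varchar(400) NOT NULL,
    CONSTRAINT PK_ProductHistoryNote PRIMARY KEY CLUSTERED (ProductHistoryID),
    CONSTRAINT FK_ProductHistoryNote_ProductHistory
        FOREIGN KEY (ProductHistoryID)
        REFERENCES dbo.ProductHistory (ProductHistoryID)
);
```

Because ProductNo is the leading column of the unique index, lookups by ProductNo can seek on that index without ProductNo having to be the clustering key.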

Also, the pattern you have is a "type 2 Slowly Changing Dimension".

Related Solutions

Sql-server – Is ‘Avoid creating a clustered index based on an incrementing key’ a myth from SQL Server 2000 days?

The myth goes back to before SQL Server 6.5, which added row-level locking. This is hinted at by Kalen Delaney.

It was to do with "hot spots" of data page usage and the fact that a whole 2k page (SQL Server 7 and higher use 8k pages) was locked, rather than an inserted row.

Edit, Feb 2012

Found an authoritative article by Kimberly L. Tripp:

"The Clustered Index Debate Continues..."

Hotspots were something that we greatly tried to avoid PRIOR to SQL Server 7.0 because of page level locking (and this is where the term hot spot became a negative term). In fact, it doesn't have to be a negative term. However, since the storage engine was rearchitected/redesigned (in SQL Server 7.0) and now includes true row level locking, this motivation (to avoid hotspots) is no longer there.

Edit, May 2013

The link in lucky7_2000's answer seems to say that hotspots can exist and that they cause issues. However, the article uses a non-unique clustered index on TranTime. This requires a uniquifier to be added, which means the index is not strictly monotonically increasing (and is too wide). The link in that answer does not contradict this answer or my links.

On a personal level, I have worked on databases where I inserted tens of thousands of rows per second into a table that has a bigint IDENTITY column as the clustered PK.
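As a rough illustration of that shape of table, the sketch below clusters on a bigint IDENTITY primary key and leaves the time column to a separate non-clustered index. Only the bigint IDENTITY clustered PK and the TranTime column name come from the text above; the table name, the Payload column and the index name are hypothetical.

```sql
-- Hypothetical names; only the bigint IDENTITY clustered PK and TranTime
-- are taken from the discussion above.
CREATE TABLE dbo.TranLog (
    TranLogID bigint IDENTITY (1,1) NOT NULL,
    TranTime  datetime2(3) NOT NULL,
    Payload   varchar(200) NULL,
    -- Unique and ever-increasing, so no uniquifier is needed
    CONSTRAINT PK_TranLog PRIMARY KEY CLUSTERED (TranLogID)
);

-- Searches by time use a separate non-clustered index instead of
-- clustering on the non-unique TranTime column.
CREATE NONCLUSTERED INDEX IX_TranLog_TranTime ON dbo.TranLog (TranTime);
```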

Sql-server – Will a primary key be added as a clustered index

Yes, SQL Server will create the primary key as clustered by default, but you don't have to accept the defaults.

ALTER TABLE dbo.foo 
  ADD CONSTRAINT pk PRIMARY KEY (bar); -- clustered

ALTER TABLE dbo.foo 
  ADD CONSTRAINT pk PRIMARY KEY CLUSTERED (bar); -- clustered

ALTER TABLE dbo.foo 
  ADD CONSTRAINT pk PRIMARY KEY NONCLUSTERED (bar); -- non-clustered

And yes, you will see some I/O activity here, so if it is a busy system, best to save this for quieter hours or a maintenance period.
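To check what an existing table ended up with, and to change it afterwards, something like the following sketch should work. It reuses the dbo.foo, pk and bar names from the statements above and assumes no foreign keys reference the primary key (they would have to be dropped first).

```sql
-- Lists each index on dbo.foo with its type and whether it backs the PK.
SELECT i.name, i.type_desc, i.is_primary_key, i.is_unique
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.foo');

-- Switching an existing clustered PK to non-clustered means dropping and
-- re-adding the constraint. Dropping the clustered PK leaves the table as a
-- heap, which is where much of the I/O comes from.
ALTER TABLE dbo.foo DROP CONSTRAINT pk;
ALTER TABLE dbo.foo ADD CONSTRAINT pk PRIMARY KEY NONCLUSTERED (bar);
```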
