Finding Potential Primary Key In Tables

sql serversql server 2014

I have tables(more than 100) that are missing primary keys. Actually, these are legacy tables. Now as per some replication requirements, we need to add primary key on each of these tables. Is there any way to find out possible primary keys on a table (ready to purchase a commercial product)?

Best Answer

You need to define the criteria for what constitutes a "possible primary key". When you have done that, you/we can search for (for instance sp_special_columns) or produce a script that uses that criteria. Such criteria could be:

A unique index on the column(s) (possibly created using a unique constraint).
The table has a column having the IDENTITY property.
A column of the uniqueidentifier data type (but that could very well also be a foreign key).
A column or set of columns which in the data has no duplicates.

For 3 and 4, you are gambling, since having no duplicates today is not guarantee that there should be duplicates tomorrow.

In general, there is a risk messing about in a database unless you own the database. Say that you slap on an identity column on each table, make it the PK, just so you can use replication. Now, there might be some code that does SELECT * and breaks down because it now gets this additional column that you added.

Related Solutions

Sql-server – SQL Server Primary key / clustered index design decision

You are correct to separate "clustered index" from "primary key":

A clustered index is the organisation of data on disk is better if
- narrow
- numeric
- increasing (strictly monotonic)
The primary key identifies a row

Note: GUIDs make poor clustering keys

In this case, with the surrogate column, the table has 2 candidate keys:

ProductHistoryID
ProductNo + CreatedDateTime

Assumed convention states that the ProductHistoryID becomes the PK, but you can leave the PK on (ProductNo, CreatedDateTime): it will just be non-clustered. Which leads to indexes:

clustered index should be on ProductHistoryID
unique non-clustered index on (ProductNo, CreatedDateTime)

Example

CREATE TABLE Product (
    ProductHistoryID int NOT NULL IDENTITY (1,1) NOT NULL,
    ProductNo ...
    CreatedDateTime ...

then you a choice of

    CONSTRAINT PK_Product PRIMARY KEY CLUSTERED (ProductHistoryID)
    CONSTRAINT UQ_Product UNIQUE NONCLUSTERED (ProductHistoryID)

    CONSTRAINT PK_Product PRIMARY KEY NONCLUSTERED (ProductNo, CreatedDateTime)
    CONSTRAINT PK_Product UNIQUE CLUSTERED (ProductHistoryID)

Also, the pattern you have is a "type 2 Slowly Changing Dimension"

Sql-server – Why aren’t primary key / foreign key matches used for joins

In many cases, there are more than one way to join two tables; See the other answers for lots of examples. Of course, one could say that it would be an error to use the 'automatic join' in those cases. Then only a handfull of simple cases where it can be used would be left.

However, there is a severe drawback! Queries that are correct today, might become an error tomorrow just by adding a second FK to the same table!

Let me say that again: by adding columns, queries that do not use those columns could turn from 'correct' into 'error'!

That is such a maintenance nightmare, that any sane style guide would prohibit to use this feature. Most already prohibit select * for the same reason!

All this would be acceptable, if performance would be enhanced. However, that's not the case.

Summarizing, this feature could be used in only a limited set of simple cases, does not increase performance, and most style guides would prohibit its usage anyway.

Therefor it is not supprising that most database vendors choose to spend their time on more important things.

Best Answer

Related Solutions

Sql-server – SQL Server Primary key / clustered index design decision

Sql-server – Why aren’t primary key / foreign key matches used for joins

Related Question