Modify legacy table schemas that have no primary key

performance primary-key schema

I've been tasked with updating our database (SQL Server 2008) so that every table in it has a proper primary key defined. For various reasons, we have built up some legacy tables over the years that have no primary key at all. My colleague wants every table to have a key so that he can use a database comparison tool when deploying changes, and he also wants indexes available on everything.

I agree that every table should have a primary key, but at this point I'm wondering how much benefit it would provide. Some of the tables with missing keys are very large, containing millions of records, and some exist simply to store data: they are inserted into often but rarely read. Some also appear to contain duplicate data, which means a laborious clean-up first, and may have no unique key available unless every field in the table is used.

So what I would like to know is:

  1. Is it worth having a key on a table even if it means making a composite key out of every field in the table? Or would an auto-incrementing number be more beneficial instead?
  2. Is there going to be a performance cost to adding keys to tables which already contain a vast amount of data? For example, if a clustered index were added, that would mean more space taken up to store the index too; is that worth it on a table which is basically used for reference data?

Best Answer

I would add a BIGINT IDENTITY column as the non-clustered primary key for any table that does not have a primary key. This will cause minimal disruption and use minimal storage space.
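As a minimal sketch of that suggestion, the statement below adds such a column to an existing table. The table name `dbo.LegacyTable`, the column name `Id`, and the constraint name `PK_LegacyTable` are hypothetical placeholders, not names from the question. Note that on a table that already holds millions of rows, SQL Server has to write the new value into every existing row, so it is probably best run in a quiet period.

```sql
-- Hypothetical names: dbo.LegacyTable, Id and PK_LegacyTable are placeholders.
-- Adds an auto-incrementing BIGINT column and makes it the primary key,
-- enforced by a non-clustered unique index so the existing row storage
-- (heap or clustered index) is not reorganised around the new column.
ALTER TABLE dbo.LegacyTable
    ADD Id BIGINT IDENTITY(1, 1) NOT NULL
        CONSTRAINT PK_LegacyTable PRIMARY KEY NONCLUSTERED;
```

Existing rows, including exact duplicates, each receive their own `Id` value, so the key can be added before any data clean-up is done.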

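Adding the key will often improve performance, especially for updates and deletes that target individual rows, since each row can then be located through the key's unique index.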
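To see how many tables are affected before changing anything, a catalog query along these lines should list every user table that currently has no primary key. It is only a sketch; the column aliases are my own choice.

```sql
-- Lists user tables in the current database that have no primary key.
-- OBJECTPROPERTY(..., 'TableHasPrimaryKey') returns 1 when a key exists, 0 otherwise.
SELECT s.name AS schema_name,
       t.name AS table_name
FROM sys.tables AS t
JOIN sys.schemas AS s
    ON s.schema_id = t.schema_id
WHERE OBJECTPROPERTY(t.object_id, 'TableHasPrimaryKey') = 0
ORDER BY s.name, t.name;
```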