Sql-server – Add a primary key constraint to an existing unique column

constraintprimary-keysql server

I've seen lots of information around adding a new primary key column to a large database table, but I have a somewhat different situation.

I have a large table (~90,000,000 records), that has no primary key. However, it DOES have a "RowID" column, which is an identity column. It also has a unique index on the RowID column.

All my research showed that a table without a primary key is terrible design that should be gotten rid of if at all possible. In my particular case; it's causing issues because Entity Framework maps the table as a read only view if it can't find a primary key.

I want to add a primary key constraint to the table on the RowID column. I was hoping that because it's not adding a new column, and there's already a unique index on that column, that it would be a small operation; however, it appears from a test run that it will take a very long time.

Having this table be inaccessible for a long time will cause issues across our business. I know that another option is to create a new table with the new schema, and copy all data from the old to the new. But this will also cause us to need to lock down the table for a long time.

Is there a way to more quickly add the primary key constraint when you already have an identity column with a unique index?

Best Answer

The assumption you have is that a lock will occur when creating a new table. This is false if you use the correct query hints & have a good action plan. I recommend the following:

  1. Create a New table with identical schema but with a primary key constraint added.
  2. INSERT INTO NewTable SELECT * FROM OldTable (this will still take a while)
  3. Use SSIS or write a good function to identify new rows in the old table whilst the insert into the newtable was occuring and write these rows to the new table. (Essentially perform another merge ontop)
  4. Rename/Drop the old table and rename the new table to the old table name.

On the topic of locks. You have a few options as I see:

  1. Specify NOLOCKS or READUNCOMITTED on your query. This has its own implications that are best read here.
  2. Take a full table lock which you've already said is less desirable but would mean step 3 of the action plan is not required.
  3. Take shared page locks (Default for select). This means that the individual pages when being read will lock and no updates or inserts will be allowed. It does mean though that other selects on this page can still occur during this lock.

It really depends on your specific need which of these three is your best COA. I suggest consulting with business to determine if an outage is really such a bad thing and if it is, consult with the developers to determine what lock level is most appropriate for the task.