Primary keys: composite and singular

database-design

In general when I design tables I always include a single unique field, even if it there is already a unique combination of columns available to create a primary key. This field is usually either a UUID (if application-generated) or a sequence (if database-generated).

Some questions:

are there any downsides to this approach (apart from storage)?
the sole exception to this rule is (for me) the join table that exists only to join other tables. Is there an argument to be made that a join table should also have a single field acting as primary key?

Best Answer

The benefits of using a surrogate key (that's your newly created single field), in my opinion, outweigh the downsides in most cases.

But there are a few cases that you may experience some drawbacks.

Think of a company's employees that work different shifts on different days of the week. Here's a simplified database:

SHIFT(ShiftID, Shift, SomeOtherColumn);
EMPLOYEE(EmployeeID, FirstName, LastName);
[TRANSACTION](TransactionID, ShiftID, EmployeeID, Date);

So, as constructed, we've created a surrogate key for each table. If we need to, say, find out what shift correspond to a particular shiftID from a particular transactionID, we'd have to run a second query. That's the downside of using a surrogate key in that case.

If the fields Shift and SomeOtherColumn from the SHIFT table can be assumed to be a candidate key (a composite primary key, in other words), then we would have to declare both fields in the our FOREIGN KEYconstraint DDL when we created the [TRANSACTION] table. So, we would be able to directly query the actual Shift from the [TRANSACTION] table without having to run an additional query.

Another issue with using a catch-all surrogate key is that it has no intrinsic value. So, if you work for a larger organization, but there are different databases (in other departments, for examples) that refer to the same items with different primary keys (surrogate keys), you may have a bit of an issue if you have join their data regularly.

Ditto for the developer who has 2 or 3 apps, maintains a different user database for each, with a different surrogate ID for each user but may have a ton of duplicate users between them.

Personally, I go on a case by case basis, but lean toward using a surrogate key more unless there is an outstanding reason I should not.

Best Answer

Related Solutions

Mysql constraint problem

Sql-server – SQL Server : primary keys advice to the whitepaper needed

Related Question