Surrogate key / Primary key: Better to use an existing unique data field or create a key field

database-designindexprimary-keyrdbms

I am not sure if this question has been asked or not. At least I couldn't find it.

I am curious about a primary key in terms of efficiency with data searching and retrieval.

This is a hypothetical example…
I have a list of students with unique StudentID (say 10 digits long, given that the school will never have that many students) and StudentName (which is unique too).

Would it be better to use StudentID as primary key or create a new field for primary key (may be 6 – 8 digits or composite of characters and numbers. i.e. ATC1002)?

If I use StudentName as the primary key, will I see any retrieval performance degradation? If so, approximately how much in general? Are there any other factors which will have an impact on the selection of the primary field?

Best Answer

You should never assume that a data point which is outside of the control of your system will never change. This means you shouldn't assume student names won't change. There are lots of reasons in the real world why names might change. Anything that is at reasonable risk of changing is a bad candidate for a primary key. Also, names are very unlikely to be unique over a student population of any reasonable size.

Some exceptions to this might be things which are controlled by an external standards body which can be reasonably well trusted to maintain consistency. This could include something the the IATA code for an airport or the symbol for an atomic element.

Regarding the efficiency of textual (natural) keys vs. integer (surrogate) keys, there is no easy answer to this as it depends on many factors. On balance, it is fair to say that surrogate integer keys are more efficient than natural textual keys - especially if your textual values are much larger than a few characters.

Nevertheless, there are advantages to natural keys beyond raw file I/O and CPU cycles, as long as you can trust the natural keys to be stable. A school's student ID is probably a pretty good candidate because you can institute an internal policy that says student IDs are granted for life and never change.