Sql-server – When to use identity increment

identitysql server

Many times I face these three choices to work with IDs:

  • Using an auto-increment identity column?
  • Suggesting a new codification to use (eg, for students: first 4 position is the year when the student joins the university, 5th position is 0 for females 1 for males, the last 4 position is a sequential number)
  • Don't add any extra column and use only a unique information that already exists (eg: name (of course must be sure that's unique (there are no two possible entries with the same name)))

Question is: which choice do I make, and according to what criteria?

Best Answer

This is an ongoing debate that many face, and experts will give good arguments for either of these. What your question boils down to is a surrogate vs. natural key (and option two seems to be a half-surrogate half-natural key).

As for your third option, oftentimes I see people do whatever they can to find a unique candidate for a primary key constraint. Take you for instance: name is your example key. That is a notoriously bad candidate for a key, as there is no uniqueness guaranteed with that. Think "John Smith". Not only that, but having that as a key will bloat your foreign key relationship, as when you go to reference that primary key in a relationship table then you will have that same data type and column width. Think nvarchar() as opposed to a narrow data type (i.e. int).

As for your second option, it sounds like you have the ability to create a unique value (by your description, able to fit in an int column). Where would you plan on doing this? This would either be logic in the database or in the application. In my opinion, I like to keep entities like keys isolated to the data tier. This "id" would most likely be used for normalized data. That should be an abstraction to the application (in my opinion). So now my reasoning brings me to either a database-generated unique value (your option two) or an identity value (your option one). Those both have the same result (as far as I can see it), therefore why go through the trouble and the development to generate your own when you can just have an IDENTITY column?

Again, others may and probably will have different and equally objective answers for going with another alternative, but my vote would be for the IDENTITY column for a unique, surrogate key.