Sql-server – Int for identity/primary key and Guid for public identifier, best practices

performancesql serversql-server-2016t-sql

I almost always use an int for identity/primary (auto incrementing) clustered key for my tables. They're easy to query and debug, bounce around in SSMS, and plenty of other reasons.

I have a case where I'm creating an application that has your typical CRUD interface (create, read, update, delete) and I want to use a Guid in the url instead of the int.

An example table would be:

CustomerId int identity(1,1) PK clustered
CustomerName nvarchar(100)
Guid uniqueidentifier

In the past when someone was viewing a detail record I would just pass the identity:

/app/details/1 <-- id of the record

As I've seen in many applications, I like the idea of this:

/app/details/guidinstead

So in short, an int as the identity, and a guid as the public facing key.

What I'm concerned about is performance if any given table using this method grows substantially. There are business reasons for this, as in a person that has access can share a particular link with another to view details.

Another reason is because it does reduce some overhead on queries in a different manner, as in determining if the person has the "right" to view a particular record.

The initial hit is looking up the real ID of the record. After that all nested resources on the record are easily achieved with joins on the identity field across many tables.

I currently have no indexing set on the [Guid] uniqueidentifier column.

So to sum it up, I want to avoid showing incrementing keys in the URLs and I don't want a nightmare later on down the road if the tables end up with lots of data. Inserts and updates are currently using the int/identity column.

Operations are like this:

  1. When creating a record, I use Guid.NewGuid() in .NET to generate the Guid column.
  2. When updating a record, I find the CustomerId based on the Guid and update accordingly because this usually ends up updating/inserting in other relational tables as well and I need that CustomerId to do so.
  3. When viewing the details of the record, it's WHERE Guid = @Guid
  4. When deleting a record, it's WHERE Guid = @Guid

Questions:

  1. Is it worth having both int and guid in the table or should I just drop the int entirely?
  2. Any other settings I should place on the table?
  3. Is there a better option? Another impossible to guess sequence of some sort that doesn't take as much space?

It's early stages so easy for me to make changes now.

Best Answer

Is it worth having both int and guid in the table or should I just drop the int entirely?

Yes, from the perspective that you can have the identity column be a unique clustered index to take advantage of inserts being append-only to the table, and you can have a nonclustered primary key on your GUID column for both security and uniqueness when locating records.

Any other settings I should place on the table?

You could consider using NEWSEQUENTIALID to have the GUID inserts be less random, or try to match the behavior natively in .NET. If someone were quite dedicated, they could perhaps guess the next sequential GUID. If you think you might attract that kind of attention, generate them in a normal, random way.

I'd also wanna make the identity column a BIGINT. They're much harder to run out of than INTs. People who get all up in arms about the 4 extra bytes per row have never dealt with the process of altering that attribute after running out of INTs.

Is there a better option? Another impossible to guess sequence of some sort that doesn't take as much space?

Nope, you're on the right track.