Sql-server – DATETIME2 and IDENTITY as composite key

database-designsql serversql-server-2008-r2

I am currently developing a new database schema with an opportunity to do things right. The purpose of the development effort is to collect data (events) from various sensors that will be recorded with a time/date stamp.

Currently there will be two disconnected databases at two different locations capturing data that may or may not occur on the same day. On a monthly basis, the drives will be pulled and sent to the customer's main office to ingest the data and merge into one database.

Once the data is available, the customer will replay the data for analysis as if they were actually there when it was captured.

Since recalling this data needs to be performance driven, I know right off the bat I want a clustered index on DATETIME2.

This leads me to the problem of designing the primary key to account for merging.

Would a composite primary key of DATETIME2 and an IDENTITY column be unique enough to avoid any collisions when the data is merged? For example, some data may be captured at a 50hz to 100hz rate.

Or is that too much of a wild card and it would be best to use a GUID as the PK? If that is the case, how should I handle a clustered index for performance (for example) where a child table would have a PK of two columns: a GUID from the parent table and a 3 char column "product" identifier?

Would it be best to add a DATETIME2 column to become the clustered index or am I making a mountain out of a mole hill?

Best Answer

Don't confuse the primary key with the key of the clustered index. Though they are often the same, they are independent. You just need your primary key to uniquely identify a record in your table across all the databases that have a copy of it.

Since recalling this data needs to be performance driven, I know right off the bat I want a clustered index on DATETIME2.

I agree with this choice, since you will primarily be writing, reading, and merging data by this column. Be sure to pick the right precision for your DATETIME2.

As for the primary key, I suggest making it (system, id), where system is a unique system identifier and id is an IDENTITY value. A CHAR(1) or TINYINT will suffice for the system identifier if you have only a handful of systems.

Don't bother managing the ranges of the IDENTITY values across the different systems. They can overlap as they please since the system identifier will distinguish records from different systems that share the same id.