SQL Server – Best option to index an application log table for quick insertion and retrieval in date order

Tags: index, sql-server, sql-server-2017

Suppose I have an application trace log table with three columns: (id uniqueidentifier, message nvarchar(max), and started datetime2).

The clustered primary key is on [id]. I need to be able to insert records as quickly as possible, but I'm not sure that's the best primary key. It makes sense because I'm using Entity Framework and that's the default, but uniqueidentifier values are essentially random for each insertion, so rows would be inserted all over the place when clustered on [id]. I think that's good, because it lowers page contention (vs inserting them all on the last data page all the time), but I'm not sure about that. At the same time, I need a way of retrieving the rows quickly in order by [started].

My question is, would it be better to:
1. Change the primary key to be composite on (started, id), so it's the only index and the data pages are in order by date, or
2. Leave the primary key on [id] and add a unique non-clustered index on (started, id)?
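
For concreteness, the two options could be sketched in T-SQL roughly as follows. The table, constraint, and index names are hypothetical and used only for illustration; the column definitions follow the description above.

```sql
-- Hypothetical object names, for illustration only.

-- Option 1: composite clustered primary key on (started, id).
-- The table has a single index and rows are stored in date order.
CREATE TABLE dbo.TraceLog_Option1
(
    id      uniqueidentifier NOT NULL,
    message nvarchar(max)    NULL,
    started datetime2        NOT NULL,
    CONSTRAINT PK_TraceLog_Option1 PRIMARY KEY CLUSTERED (started, id)
);

-- Option 2: clustered primary key on id, plus a unique
-- non-clustered index on (started, id) for ordered retrieval.
CREATE TABLE dbo.TraceLog_Option2
(
    id      uniqueidentifier NOT NULL,
    message nvarchar(max)    NULL,
    started datetime2        NOT NULL,
    CONSTRAINT PK_TraceLog_Option2 PRIMARY KEY CLUSTERED (id)
);

CREATE UNIQUE NONCLUSTERED INDEX UX_TraceLog_Option2_started_id
    ON dbo.TraceLog_Option2 (started, id);

-- The kind of retrieval both options are meant to speed up.
SELECT TOP (100) id, started, message
FROM dbo.TraceLog_Option1
ORDER BY started;
```

With option 2, the non-clustered index does not contain [message], so a query that also returns [message] has to look each row up in the clustered index after finding it in date order.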

My reasoning for including both columns (started, id) in the second option is that it can then be a unique index that includes the clustering key (which is always implicitly included anyway). I don't want to include any other columns in the index because that would duplicate a lot of [message] data unnecessarily, when all I'm really interested in is an index that speeds up retrieval of rows in order. Or is there a better option than the ones I mentioned?

Best Answer

"I think that's good, because it lowers page contention (vs inserting them all on the last data page all the time), but I'm not sure about that."

At insert rates under 10,000 rows/sec, hot-page latch contention is not a big issue, and the efficiency and locality of end-of-index inserts are preferable.
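
If you want to check whether last-page latch contention is actually showing up on your server, one rough way is to look at the cumulative PAGELATCH waits. This is only a sketch: `sys.dm_os_wait_stats` is instance-wide, accumulates since the last restart (or since the stats were cleared), and requires VIEW SERVER STATE permission, so it shows whether page latch waits exist at all rather than which table causes them.

```sql
-- Cumulative page latch waits for the whole instance.
-- Large, fast-growing PAGELATCH_EX numbers during the insert workload
-- would suggest hot-page contention; small numbers suggest it is not an issue.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE N'PAGELATCH[_]%'
ORDER BY wait_time_ms DESC;
```

Comparing two snapshots taken before and after a period of heavy inserts gives a more useful picture than a single reading.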

  "1. Change the primary key to be composite on (started, id) so it's the only index and the data pages are in order by date"

Yes. Prefer the solution where you don't need two indexes. There is also a technique for using a datetime2 column as the only unique index on the table: fill the sub-second precision with values generated by a sequence.
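
The write-up of that technique isn't reproduced here, so the following is only one plausible way to do it, not necessarily the referenced solution. It assumes a `datetime2(7)` column (100 ns resolution, so 10,000,000 distinct values per second) and a cycling sequence that supplies the sub-second part. All object names are hypothetical.

```sql
-- Hypothetical names; a sketch under the assumptions stated above.

-- Cycling sequence covering every 100 ns tick within one second.
CREATE SEQUENCE dbo.TraceLogTick AS int
    START WITH 0
    INCREMENT BY 1
    MINVALUE 0
    MAXVALUE 9999999
    CYCLE
    CACHE 1000;

CREATE TABLE dbo.TraceLog
(
    started datetime2(7)  NOT NULL,
    message nvarchar(max) NULL,
    CONSTRAINT PK_TraceLog PRIMARY KEY CLUSTERED (started)
);

-- Truncate the current time to the whole second (a CAST to datetime2(0)
-- would round instead), then add the sequence value as 100 ns ticks.
DECLARE @now    datetime2(7) = SYSUTCDATETIME();
DECLARE @second datetime2(7) = DATEADD(nanosecond, -DATEPART(nanosecond, @now), @now);
DECLARE @tick   int          = NEXT VALUE FOR dbo.TraceLogTick;

INSERT INTO dbo.TraceLog (started, message)
VALUES (DATEADD(nanosecond, @tick * 100, @second), N'example message');
```

The trade-off is that the fractional seconds no longer reflect the real time, and ordering within a single second follows the sequence (and can go backwards when the sequence wraps). Values stay unique as long as the sequence does not cycle all the way around within one second.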