Sql-server – SQL Server 2008: Unused Clustered Index

index-tuningsql serversql-server-2008-r2

I have an MS SQL Server 2008 R2 database that contains a table that has just under 100 million rows & is both read and write heavy.

It contains Tweets, and the id of each Tweet is one of the fields, but it is not used as the primary key, instead it has a non-clustered index on it & there is a separate field for the primary key "localPkId" (also the clustered index) that is just an identity and isn't actually used anywhere (so all other tables will still join on the Tweet ID, ignoring the "localPkId" field).

The Tweets that get inserted will come into the system out of order, so by not using Tweet ID as the clustered index, it means new rows will always get put at the end, speeding up inserts.

This had been working fine for some time as the table had a predominantly write heavy workload, but now it's being used for a lot of reads as well and I'm concerned that as the workload continues to grow this will become a problem.

First question: Does having "localPkId" offer any advantage over the table being a heap/having no clustered index?

Second question: Should the Tweet ID be the Primary Key & Clustered Index & then set a low fillfactor? If so, how would I go about choosing an appropriate value for the fillfactor in a case like this?

Best Answer

But you have an index on TweetID. Is that index not used by the joins?

Even a non-clustered index will fragment with insert and update. As that index fragments inserts take longer. How are you managing fill factor and index maintenance on this non-clustered index?

An unused PK does nothing.

TweetID as PK will save some space. What % of rows get added daily. If only 1% rows get added daily you could use a fill factor of 90% and defrag daily or weekly. Start with like 80% or 90% and see how fast it fragments.

If the TweetID are out of order but in the same range then the problem you have there is fill factor gets hammered in that range and unused in the rest of the range. But a defag is faster as it is only cleaning up that part of the range. In that case you may be better off with 100% fill factor and a regular index reorganize.

There is no cut and dry answer. You need to manage and monitor the index for your data and your workload.