Sql-server – Don’t rebuild indexes < 1000 pages - then why bother having them

indexsql-server-2008-r2

Microsoft recommends not bothering to rebuild indexes that have less than 1,000 pages (from memory I think it's because they are so small that they will be held in RAM).

If I'm not rebuilding them, then they'll become heavily fragmented. In that case is there any point in having them at all? I'm imagining a phone book where every single entry is in the wrong place – it would be useless!

For clarification I'm mainly talking about non-clustered indexes, though I'd be interested to know whether or not the answer would be different for clustered indexes.

The answer to this question suggests that indexes can help to avoid blocking issues behind row locks, which would be a reason why the indexes may still be useful if it is correct and applies to SQL Server.

Best Answer

Just because the index doesn't require rebuilding does no mean is useless. A 99999% fragmented index is still 9999 times better than a heap for looking up a key. The negative effects of fragmentation are heavily, heavily, heavily, heavily overestimated in popular DBA culture.

Locking is a valid point (w/o indexes all read-write operations are guaranteed to collide), but not the only one. Even if the data is read-only, so locking is never an issue, it still needs an index for seek and range scans operations.

I'm imagining a phone book where every single entry is in the wrong place

This is a very misguided interpretation of what fragmentation is and its effects. A fragmented phone book still has all the entries in order. Looking up an entry still takes 2-3 page flips, it will be something like:

  • the cover says to find 'P's go to page 645.
  • page 645 says to find 'PA's go to page 56.
  • page 56 says all 'PAUL's are on page 646.

So yes, you flip back and forth 2-3 times. The fragmented phonebook also may have every page half-blank, so it weight twice as much as the newly minted phonebook.

Compare it with the heap:

  • the cover will say I don't know where 'PAUL's are, but if they're in the book, I'm sure they're between the front cover and the back cover! And do remember that finding the first 'PAUL' does not mean you found all of them...

The only data/workload that really doesn't need indexes is columnstore/data warehousing. Columnstore access and processing is completely different and can get away w/o indexes, because every query is expected to scan all data anyway. Segment elimination, per-column IO and other forces contribute to keeping these always-end-to-end scans competitive.