SQL Server Maintenance – Updating Statistics and Stopping Index Defragmentation

fragmentationmaintenancesql server

I have several vendor databases that have most (99%) of the tables clustered on GUIDs. Currently we are rebuilding the tables every weekend. Most tables are 80% or more fragmented by this time.

Instead of this endless cycle, would we be better off just not rebuilding?

Would it be more beneficial to update statics more often instead of rebuilding the indexes?

I have read Stop Worrying About SQL Server Fragmentation by Brent Ozar, but it doesn't really give an answer to my question. I was hoping for a more complete article and possibly one that was updated more recently.

I am trying not to do this as I know it is not necessary. If it helps to understand my situation better please see the beginning part of this article again from Brent's blog. I would like to have a formal plan with articles and use cases for this to propose to management and go through the proper documentation before making the change.

I am working to design a new maintenance plan for our production databases. Our only maintenance is the weekly rebuild. There are several (30+) databases that are all about 700 GB-1.2 TB. The tables range from 100-300 GB in size for the largest.

I would like to have a better maintenance plan that provides better performance in the end.

Best Answer

Statistics are used by the Query Optimizer to help estimate row counts for operations so that it can determine the most efficient means of accomplishing that operation.

Since you are using GUIDs as the Primary Key in these tables, it is highly doubtful that you are doing any range operations / multi-row operations. Since GUIDs have no inherent order, you are really only ever doing single-row operations. And since these are Primary Keys, they are "unique" and any value only exists once, or not at all, hence your operations are either going to affect 1 row when filtering on the PK, or all rows if not filtering on anything (I am ignoring other filters for the moment since those would use statistics for another index, or an auto-created statistic on a column, neither of which are impacted by this particular issue). In this sense, merely updating statistics isn't going to help much (or at least shouldn't).

At the same time, since operations against these PKs are singleton operations (i.e. seeks), those are not greatly impacted by fragmentation. In this sense, rebuilding the Clustered PKs is not going to have a terribly noticeable affect on those operations. HOWEVER, rebuilding (at least occassionally) still has merit.

So my advice is this:

The easier but less overall effective path:

Since your system is currently suffering from a rampant page-split debacle, you need to free up that wasted / unused space as there is a lot of wasted space across many data pages that are not only filling up the Buffer Pool, but its also making backup/restore operations take longer. So, try doing index REORGANIZE operations for a few weeks and then a REBUILD. I'm not sure what Edition of SQL Server you are using, but if not using Enterprise then REBUILD operations are offline, yet REORGANIZE operations are online.

Doing online operations not only reduces the need for the maintenance window, but they can be done more often throughout the week. You can even stagger which indexes get rebuilt when and perhaps pick the top N index, based on fragmentation level, and REORGANIZE a certain number of them each night.
The harder but much more effective path:

Even if the REORGANIZE and occasional REBUILD plan helps, you are still really just using a bucket to scoop water out of a boat that has a hole in it. But your system is growing and so that "hole" is getting bigger. All of this REBUILD / REORG stuff is merely an effort to live with a bad situation. It might work for a while, it might even work forever, but there could easily come a point when it does not work. The absolutely better approach is to actually fix the bad situation, not just sweep it under the rug.

That absolutely better approach is to remodel the tables to use INT / BIGINT columns for the Clustered PKs. If the app code uses the GUIDs, or if the GUIDs are known by external systems and hence referenced by external systems, then you can still do this major modification: just keep the GUID values in one table, create a NonClustered Index on the column, and then do a lookup on those values at the beginning of operations to translate them to the internal INT / BIGINT values. This will shrink the size of all but a few tables (especially when considering that NonClustered Indexes have the Clustered Index keys duplicated in them, hence most of your NonClustered Indexes are 16 bytes larger than the sum of the index keys, when they could be just 4 or 8 bytes larger if the Clustered Key was INT or BIGINT, respectively). Smaller tables take up less space in the Buffer Pool, hence pages stay cached for longer and queries are more efficient. Backup and restore operations are also faster. Index REORG and REBUILD operations are also faster.

I have suggested approach #2 before to you in this answer: Best fill factor for GUID clustering key. So, while approach #1 might help, I don't think your situation will drastically improve until moving forward with approach #2. And I do fully understand that it is much easier said than done since I don't have to deal with time / resource constraints and other complicating factors. BUT, I will say that if management is not keen on pursuing such a large project, they need to consider the value-proposition of having a faster system, spending less money on disk space (data files, log files, and backups), reduced down-time / maintenance window, reduced recurring support time due to you needing to "fix" this ever-worsening situation at least once per year, etc, etc, etc. In the long-term, this approach is actually cheaper. This is your best "maintenance plan" that provides for the best performance :-). And it can be accomplished in phases, so it doesn't have to be done in one massive release.

Best Answer

Related Solutions

Sql-server – Sql Server Maintenance Plan – Best Practices on Tasks and Scheduling

Sql-server – Maintenance Plan Takes Too Long – Locks Tables – Indexing To Blame

Related Question