Sql-server – Index maintenance for AG DBs

availability-groupsfragmentationindex-maintenancesql serversql-server-2012

I wanted to get some opinions on the index maintenance we currently do at the company where I work. One of our production SQL 2012 clusters that consists of 4 nodes, with two instances on each, has multiple AGs which service some very heavy workloads. Some of the databases in these AGs are 2TB+.

We have a standard daily index maintenance routine that does the usual rebuild vs reorg depending on the fragmentation level, but we also only do reorgs in indexes that are over a certain size because we have seen issues with SYNC latency if these larger indexes were to be rebuilt. Once index maintenance has been performed we then update statistics etc.

This job can sometimes run up to 12 hours+ so it impacts on our critical business hours where we see peak traffic, so we really need to do something to alleviate this.

I've seen quite a bit of commentary recently where it has been suggested that index maintenance isn't always required at all, and I suspect that may be the case for us where we are doing a reorg on large indexes that are only 5% fragmented.

I guess I'd like some ideas on how I could identify indexes where the level of maintenance we do daily isn't necessarily required, other than disabling it and observing the impact, if any.

Best Answer

You're not alone on seeing sync latency with AGs when doing index rebuilds. From Sean Gallardy's blog post SQL Server Index Maintenance – You’re Doing It Wrong (emphasis mine):

In the worst of the cases, which does seem to happen quite often, it can bring the entire server to its knees, causing long blocking waits, filling up the log, causing remote AG replicas to fall behind, among a large list of other items. This does have real and painful consequences.

Note that an important caveat mentioned in the comments of that post is that fragmentation can affect the performance of SQL Server's readahead mechanism, which is used heavily during large table scans (ETL loads, big reporting queries).

I honestly like your suggestion of disabling maintenance and seeing what happens. You could disable it on just the very large tables, and see if there is any negative impact vs the gains you get in terms of shorter maintenance windows.

To comment on this specific statement in your question:

Once index maintenance has been performed we then update statistics etc.

FYI, rebuilding an index will result in statistics being updated for that index with FULLSCAN. So make sure you're not updating stats twice on these tables that are getting rebuilt.

You should also take a look at Erik Darling's blog post Because Your Index Maintenance Script Is Measuring The Wrong Thing. You could check avg_page_space_used_in_percent to determine if there are indexes where you can reclaim a lot of space (on disk and in memory) by rebuilding. This would probably be the biggest "win" from rebuilds.

On a related note, another of Erik's posts (What Metrics Does Rebuilding Indexes Improve?) points out some specific situations that might warrant an ad-hoc rebuild:

It’s my opinion, and you can take it or leave it, that index rebuilds should be reserved for special circumstances.

  • You deleted a lot of data
  • You need to change something about the index
  • You have a Heap with a lot of forwarded fetches

Unfortunately, there isn't an easy button here. Hopefully the links help inform your decision process though.