SQL Server – Maintain an Up-To-Date Copy of a Very Large Table if a Clustering/Partitioning/Moving Operation Needs to Be Killed/Rolled Back

clustered-index, high-availability, partitioning, rollback, sql-server

I included the following information in a previous post, but there I asked the 'why' instead of the 'how'.

So here's the question…

SQL Server 2014, CU6. 4.5 TB database, simple recovery. The most important/accessed table is 1.6 TB, located on the primary filegroup. It's currently about 95% fragmented, so almost all queries need to do a full scan of its 600M+ rows. Not surprisingly, it's an I/O bottleneck, and I'm unable to defrag in any acceptable window of time. My goal is to move this table out of the .mdf and cluster/partition it onto its own filegroup (ONLINE=ON & SORT_IN_TEMPDB=ON). This new filegroup would be physically isolated on a dedicated disk array.
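For concreteness, the move I have in mind looks roughly like this (all object names here — `MyDb`, `dbo.BigTable`, `PK_BigTable`, `FG_BigTable` — are placeholders, and the file sizes are illustrative):

```sql
-- Create the new, physically isolated filegroup and a file on the dedicated array.
ALTER DATABASE MyDb ADD FILEGROUP FG_BigTable;
ALTER DATABASE MyDb ADD FILE
    (NAME = 'BigTable_01', FILENAME = 'E:\Data\BigTable_01.ndf', SIZE = 500GB)
TO FILEGROUP FG_BigTable;

-- Recreate the clustered index on the new filegroup in one online operation.
-- DROP_EXISTING = ON rebuilds the existing clustered index in place of a
-- separate drop + create; for partitioning, the ON clause would name a
-- partition scheme instead of the filegroup.
CREATE UNIQUE CLUSTERED INDEX PK_BigTable
ON dbo.BigTable (Id)
WITH (DROP_EXISTING = ON, ONLINE = ON, SORT_IN_TEMPDB = ON)
ON FG_BigTable;
```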

But I'm very concerned about any issue that may cause (or necessitate) a rollback (things such as a power failure, deadlocking, corruption, unacceptably slow performance during the operation, etc.).

Further, given the length of time this clustering/partitioning/moving process may take, any backups taken during that time would be useless. They'd immediately go into recovery after being restored and, again, rollback/rollforward time would be unacceptable. Same deal if I were to switch to the full recovery model and use transactional replication or log shipping: a worthless secondary or subscriber.

How would you handle this process?

I've thought about making a copy of the table and putting triggers on the source to keep the copy up-to-date.
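A minimal sketch of that trigger idea, assuming a copy table `dbo.BigTable_Copy` with the same schema and an `Id` key column (both names hypothetical):

```sql
-- Keep dbo.BigTable_Copy in sync with dbo.BigTable.
-- An UPDATE fires as a delete (old rows) plus an insert (new rows).
CREATE TRIGGER trg_BigTable_Sync
ON dbo.BigTable
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Remove rows that were deleted or are about to be replaced.
    DELETE c
    FROM dbo.BigTable_Copy AS c
    JOIN deleted AS d ON d.Id = c.Id;

    -- Re-insert the current versions of inserted/updated rows.
    INSERT INTO dbo.BigTable_Copy
    SELECT * FROM inserted;
END;
```

The obvious cost is that every write to the source now pays for a second write, which matters on a table this hot.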

I've got large buffers of space on all volumes involved. I want to pull the trigger. Boss says 'not yet'.

Thanks for reading…

-Greg

Best Answer

I've done this a few times, though my servers are only 2008 R2. I've moved tables like this and 'partitioned' data, both because data files were reaching the maximum size (16 TB) and because identity columns were reaching the integer maximum.

Index rebuilds on big tables can take some time to roll back. A reorganization does not, but it is much slower (single-threaded). You can kill it immediately and it will just stop where it is, so you could do it over the course of several nights if need be. I had to do this with some quite unwieldy full-text catalogs.
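The nightly-window approach is just (index name hypothetical):

```sql
-- Safe to KILL at any point; completed page moves are kept, so progress
-- accumulates across multiple maintenance windows.
ALTER INDEX PK_BigTable ON dbo.BigTable REORGANIZE;

-- Optionally compact LOB data as well:
ALTER INDEX PK_BigTable ON dbo.BigTable
REORGANIZE WITH (LOB_COMPACTION = ON);
```

Note that REORGANIZE defragments in place; it can't move the index to a new filegroup, so it only helps with the fragmentation half of your problem.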

Rebuilding clustered indexes to new filegroups is not an online operation if you have a (MAX) data type in your table -- see comments below. That surprised me when I was working on my process. Also, be careful with sorting in tempdb: I once attempted a full statistics scan on a 3 TB table and my 500 GB tempdb LUN ran out of space overnight.
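Before you start, it's worth checking whether the table actually has any LOB/(MAX) columns that would trip over that restriction. A quick catalog query (table name is a placeholder):

```sql
-- List columns stored as LOB data: (MAX) types report max_length = -1,
-- and the legacy text/ntext/image/xml types are LOB as well.
SELECT c.name AS column_name, t.name AS type_name, c.max_length
FROM sys.columns AS c
JOIN sys.types  AS t ON t.user_type_id = c.user_type_id
WHERE c.object_id = OBJECT_ID('dbo.BigTable')
  AND (c.max_length = -1 OR t.name IN ('text', 'ntext', 'image', 'xml'));
```

If it comes back empty, the online rebuild path is open; if not, test the operation on a restored copy first to see how your version behaves.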

Let me know if you have any questions.

Thanks