I am currently migrating databases from SQL Server 2008 (SP4) to SQL Server 2017 (CU3). A significant change is that all filegroups have two data files after migration. To accomplish this task I restore a backup, add the new filegroup with two equally sized files as well as the same autogrowth settings and transfer the data by using the following syntax:
CREATE UNIQUE CLUSTERED INDEX <PK of the table> ..... WITH (DROP_EXISTING = ON ,...) ON <new Filegroup>
Unfortunately I also have to move some LOBs then things get a little more complicated:
- Add a partition scheme and partition function (basically with the same target)
- CREATE the clustered index (WITH DROP_EXISTING=ON) on the new partition scheme
- CREATE the clustered index (WITH DROP_EXISTING=ON) on the new filegroup
- DROP the partition scheme and function
This technique is described by Kimberly Tripp here and goes back to Brad Hoff.
Thank you for staying tuned and now to my question:
How much free space is needed in a filegroup by rebuilding an index like this?
I'll give you an example:
- I have one LOB table which is 220GB in size (according to total pages in sys.allocation_units divided by 128 divided by 1024).
- Presized the two files in my new (empty) filegroup (to host just that table) by 220/2 = 110 GB each with filegrowth = 0.
- Tried to transfer the table using the technique described above but got error message
Could not allocate space for object 'xyz'.'xyz_pk' in database 'abc' because the 'def' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.
- Increased the filesize to 112 GB each…tried again and got the same error message
- Finally turned on autogrowth for the files and the process finished successfully
however each file then was 220 GB in size leaving 50% of free space within each file.
Here is an output of the suggested diagnostic query
The file size adds up to 227,22 GB.
Up to now I do not know any other remedy as to perform DBCC SHRINKFILE
to get rid of the ridiculous high amount of free space. However that's not something I am particulary proud about…it takes ages potentially leaves corruption etc.etc.
Can you help me understanding why SQL Server allocates so much free space while afterwards happily filling my two files proportionally?
I'll try to prepare a demo afterwards…sorry but I am short on time now and perhaps some of you experts do already know the reason for this.
Thanks in advance for your help
Martin
Best Answer
First of all thank you guys for helping me to get on track with your comments.
I have now worked through an example and have a better understanding what's going on.
The problem arises with moving LOB-Data (such as VARCHAR(MAX), XML and so on) to another filegroup. When you rebuild a clustered index on another filegroup the LOB-Data stays at it's former place (set by the
TEXTIMAGE ON
command in the CREATE TABLE statement).One classic way to move the LOB-Data is to create a second table with the same structure in the new filegroup, copy data over, drop the old table and rename the new one. However this brings in all sorts of possible issues like lost data, invalidated data (because of missing check constraints) and error handling is quite tough. I have done this for one table in the past but IMHO it doesn't scale well and consider the nightmare of having to transfer 100 tables and you got errors for table 15, 33, 88 and 99 to fix.
Therefore I use a well-known trick regarding partitioning: As described by Kimberly Tripp LOB-Data does move to the new filegroup when you put partitioning on it. As I do not plan to use partitioning in the long run but just as a helper for moving that LOBs, the partition scheme is quite dull (throwing all partitions into one filegroup): I don't even care, which partition the data is on as I just want to get them moved. Actually this technique and the implementation is not invented by myself...I use a formidable script by Mark White. My mistake was to not fully understand what this script does and what the implications are....which I have now:
For LOB-Data it is necessary to rebuild (or recreate) the table (mostly the clustered index) twice: first with putting partitioning on it and second with removing the partitioning. Whether you use
SORT_IN_TEMPDB=ON
or not this results in having to provide the space of the original data TWICE: if your original table has 100MB, you need to provide 200MB for the operation to succeed. At the beginning I was quite puzzled, ending up with my new data files which had a lot of free space after the operation was finished.Now I accepted that I can't cheat around avoiding the free space. However I could avoid the necessity to shrink files afterwards. Therefore my solution is to do the first rebuild on a temporary filegroup and the second rebuild (removing partioning) on the destination filegroup. The temporary filegroup can be removed afterwards (if hopefully I don't hit the error message "The filegroup cannot be removed" (have a look at my question here) anymore.
Thanks for reading and your help
Martin
Here is a repro script for my problem which includes the solution I have come up for it: