Sql-server – Filegroup with multiple files: odd behaviour when moving stuff in

filegroupssql serversql-server-2017

I am currently migrating databases from SQL Server 2008 (SP4) to SQL Server 2017 (CU3). A significant change is that all filegroups have two data files after migration. To accomplish this task I restore a backup, add the new filegroup with two equally sized files as well as the same autogrowth settings and transfer the data by using the following syntax:

CREATE UNIQUE CLUSTERED INDEX <PK of the table> ..... WITH (DROP_EXISTING = ON ,...) ON <new Filegroup>

Unfortunately I also have to move some LOBs then things get a little more complicated:

  • Add a partition scheme and partition function (basically with the same target)
  • CREATE the clustered index (WITH DROP_EXISTING=ON) on the new partition scheme
  • CREATE the clustered index (WITH DROP_EXISTING=ON) on the new filegroup
  • DROP the partition scheme and function

This technique is described by Kimberly Tripp here and goes back to Brad Hoff.

Thank you for staying tuned and now to my question:

How much free space is needed in a filegroup by rebuilding an index like this?

I'll give you an example:

  • I have one LOB table which is 220GB in size (according to total pages in sys.allocation_units divided by 128 divided by 1024).
  • Presized the two files in my new (empty) filegroup (to host just that table) by 220/2 = 110 GB each with filegrowth = 0.
  • Tried to transfer the table using the technique described above but got error message

Could not allocate space for object 'xyz'.'xyz_pk' in database 'abc' because the 'def' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.

  • Increased the filesize to 112 GB each…tried again and got the same error message
  • Finally turned on autogrowth for the files and the process finished successfully

however each file then was 220 GB in size leaving 50% of free space within each file.

Here is an output of the suggested diagnostic query
filesize of table
The file size adds up to 227,22 GB.

Up to now I do not know any other remedy as to perform DBCC SHRINKFILE to get rid of the ridiculous high amount of free space. However that's not something I am particulary proud about…it takes ages potentially leaves corruption etc.etc.

Can you help me understanding why SQL Server allocates so much free space while afterwards happily filling my two files proportionally?

I'll try to prepare a demo afterwards…sorry but I am short on time now and perhaps some of you experts do already know the reason for this.

Thanks in advance for your help

Martin

Best Answer

First of all thank you guys for helping me to get on track with your comments.

I have now worked through an example and have a better understanding what's going on.

The problem arises with moving LOB-Data (such as VARCHAR(MAX), XML and so on) to another filegroup. When you rebuild a clustered index on another filegroup the LOB-Data stays at it's former place (set by the TEXTIMAGE ON command in the CREATE TABLE statement).

One classic way to move the LOB-Data is to create a second table with the same structure in the new filegroup, copy data over, drop the old table and rename the new one. However this brings in all sorts of possible issues like lost data, invalidated data (because of missing check constraints) and error handling is quite tough. I have done this for one table in the past but IMHO it doesn't scale well and consider the nightmare of having to transfer 100 tables and you got errors for table 15, 33, 88 and 99 to fix.

Therefore I use a well-known trick regarding partitioning: As described by Kimberly Tripp LOB-Data does move to the new filegroup when you put partitioning on it. As I do not plan to use partitioning in the long run but just as a helper for moving that LOBs, the partition scheme is quite dull (throwing all partitions into one filegroup): I don't even care, which partition the data is on as I just want to get them moved. Actually this technique and the implementation is not invented by myself...I use a formidable script by Mark White. My mistake was to not fully understand what this script does and what the implications are....which I have now:

For LOB-Data it is necessary to rebuild (or recreate) the table (mostly the clustered index) twice: first with putting partitioning on it and second with removing the partitioning. Whether you use SORT_IN_TEMPDB=ON or not this results in having to provide the space of the original data TWICE: if your original table has 100MB, you need to provide 200MB for the operation to succeed. At the beginning I was quite puzzled, ending up with my new data files which had a lot of free space after the operation was finished.

Now I accepted that I can't cheat around avoiding the free space. However I could avoid the necessity to shrink files afterwards. Therefore my solution is to do the first rebuild on a temporary filegroup and the second rebuild (removing partioning) on the destination filegroup. The temporary filegroup can be removed afterwards (if hopefully I don't hit the error message "The filegroup cannot be removed" (have a look at my question here) anymore.

Thanks for reading and your help

Martin

Here is a repro script for my problem which includes the solution I have come up for it:

    /*============================================================================
  Adapted the following file published by sqlskills to demonstrate filegrowth
  after partitioning for StackOverflow Question.

  Martin Guth, 02.02.2018

  File:     MovingLOBData.sql

  Summary:  Because 2012 supports online index rebuilds - even with LOB.
            You might think this means you can move LOB data around 
            (one of the VERY cool things you can do with IN_ROW data to 
            actually move it). However, the behavior of LOB data is NOT
            necessarily intuitive. This script will show you how/why/what!

  SQL Server Versions: SQL Server 2012
------------------------------------------------------------------------------
  Written by SQLskills.com

  (c) SQLskills.com. All rights reserved.

  For more scripts and sample code, check out 
    http://www.SQLskills.com

  You may alter this code for your own *non-commercial* purposes. You may
  republish altered code as long as you include this copyright and give due
  credit, but you must obtain prior permission before blogging this code.

  THIS CODE AND INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF 
  ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED 
  TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
  PARTICULAR PURPOSE.
============================================================================*/

SET NOCOUNT ON
go

USE master
go


CREATE DATABASE [TestLOB]
 CONTAINMENT = NONE
ON  PRIMARY 
(   NAME = N'TestLOBPrimary'
    , FILENAME = N'U:\DB_DATA\TestLOBPrimary.mdf' 
    , SIZE = 100MB , FILEGROWTH = 1024KB ), 

FILEGROUP [FG1] 
(   NAME = N'FG1File1'
    , FILENAME = N'U:\DB_DATA\FG1File1.ndf' 
    , SIZE = 40MB , FILEGROWTH = 20480KB ), 
(   NAME = N'FG1File2'
    , FILENAME = N'U:\DB_DATA\FG1File2.ndf' 
    , SIZE = 40MB , FILEGROWTH = 20480KB ), 

FILEGROUP [FG2] 
(   NAME = N'FG2File1'
    , FILENAME = N'U:\DB_DATA\FG2File1.ndf' 
    , SIZE = 20MB , FILEGROWTH = 0MB ), 
(   NAME = N'FG2File2'
    , FILENAME = N'U:\DB_DATA\FG2File2.ndf' 
    , SIZE = 20MB , FILEGROWTH = 0MB ),

FILEGROUP [tempLOB] 
(   NAME = N'tempLOB1'
    , FILENAME = N'U:\DB_DATA\templob1.ndf' 
    , SIZE = 20MB , FILEGROWTH = 0MB ), 
(   NAME = N'tempLOB2'
    , FILENAME = N'U:\DB_DATA\templob2.ndf' 
    , SIZE = 20MB , FILEGROWTH = 0MB )

LOG ON 
(   NAME = N'TestLOBLog'
    , FILENAME = N'U:\DB_DATA\TestLOBLog.ldf' 
    , SIZE = 10MB , FILEGROWTH = 10MB)
GO

USE TestLOB
go

ALTER DATABASE TestLOB 
MODIFY FILEGROUP FG1 DEFAULT
go

--DROP TABLE TestLobTable;

CREATE TABLE dbo.TestLobTable
(
    c1  int identity,
    c2  char(8000)      default 'this is a test',
    c3  varchar(max)    NULL
) -- will be created on FG1
go



INSERT INTO dbo.TestLobTable 
( 
    c2, 
    c3
)
VALUES
(
    'this is a test',
    REPLICATE (convert(varchar(max), 'ABC'), 8000) 
)
go 1000 

CREATE UNIQUE CLUSTERED INDEX TestLobTableCL 
ON TestLobTable (c1)
go


sp_blitzIndex
    @databaseName = 'TestLOB',
    @schemaName = 'dbo',
    @tableName = 'TestLobTable'

go

-- size is roughly 40 MB: 1,000 rows; 39.2MB; 31.3MB LOB


SELECT 
    f.name AS [filename], 
    fu.total_page_count AS [pageCount], 
    fu.total_page_count/128.0 [sizeMBTotal],
    (fu.total_page_count - unallocated_extent_page_count) /128.0 [sizeMBUsed],
    fu.*
FROM sys.dm_db_file_space_usage fu
INNER JOIN sys.database_files  f ON fu.file_id = f.file_id
INNER JOIN sys.filegroups fg ON fu.filegroup_id = fg.data_space_id
go

/*
filename    pageCount   sizeMBTotal sizeMBUsed
TestLOBPrimary  12800   100.000000  2.875000
FG1File1    5120    40.000000   19.687500
FG1File2    5120    40.000000   19.687500
FG2File1    2560    20.000000   0.062500
FG2File2    2560    20.000000   0.062500


--> 2*19,687 MB are occupied in Filegroup 1 ---> approx 40 MB in total in Filegroup 1
*/


/* moving Lob data using partitioning trick */
CREATE PARTITION FUNCTION PF_TestLobTable (int)
AS RANGE RIGHT FOR VALUES (0)
go


CREATE PARTITION SCHEME PS_TestLobTable 
AS PARTITION PF_TestLobTable
TO ( fg2, fg2 )
go


CREATE UNIQUE CLUSTERED INDEX TestLobTableCL 
ON TestLobTable (c1)
WITH (DROP_EXISTING = ON, SORT_IN_TEMPDB= OFF)
ON PS_TestLobTable (c1)
go



SELECT 
    f.name AS [filename], 
    fu.total_page_count AS [pageCount], 
    fu.total_page_count/128.0 [sizeMBTotal],
    (fu.total_page_count - unallocated_extent_page_count) /128.0 [sizeMBUsed],
    fu.*
FROM sys.dm_db_file_space_usage fu
INNER JOIN sys.database_files  f ON fu.file_id = f.file_id
INNER JOIN sys.filegroups fg ON fu.filegroup_id = fg.data_space_id
go

/*
filename    pageCount   sizeMBTotal sizeMBUsed
TestLOBPrimary  12800   100.000000  3.062500
FG1File1    5120    40.000000   0.062500
FG1File2    5120    40.000000   0.062500
FG2File1    2688    21.000000   19.750000
FG2File2    2688    21.000000   19.687500


--> now Filegroup 2 has roughly 40 MB data...interestingly the create index would fail if having 2*20MB capacity available but would pass at 2*21MB
*/

-- try to recreate the index again to get rid of partitioning
CREATE UNIQUE CLUSTERED INDEX TestLobTableCL 
ON TestLobTable (c1)
WITH (DROP_EXISTING = ON, SORT_IN_TEMPDB= OFF)
ON [FG2]
go

/* error message 1105
Could not allocate space for object 'dbo.TestLobTable'.'TestLobTableCL' in database 'TestLOB' because the 'FG2' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.

--> makes sense because no free space available in filegroup with partitioned clustered index already present
*/


ALTER DATABASE TestLOB MODIFY FILE (NAME = N'FG2File1', SIZE=41MB); 
ALTER DATABASE TestLOB MODIFY FILE (NAME = N'FG2File2', SIZE=41MB); 

-- rebuild again without sort in tempdb
CREATE UNIQUE CLUSTERED INDEX TestLobTableCL 
ON TestLobTable (c1)
WITH (DROP_EXISTING = ON,  SORT_IN_TEMPDB= OFF)
ON [FG2]
go

SELECT 
    f.name AS [filename], 
    fu.total_page_count AS [pageCount], 
    fu.total_page_count/128.0 [sizeMBTotal],
    (fu.total_page_count - unallocated_extent_page_count) /128.0 [sizeMBUsed],
    fu.*
FROM sys.dm_db_file_space_usage fu
INNER JOIN sys.database_files  f ON fu.file_id = f.file_id
INNER JOIN sys.filegroups fg ON fu.filegroup_id = fg.data_space_id
go

/*
filename    pageCount   sizeMBTotal sizeMBUsed
TestLOBPrimary  12800   100.000000  3.062500
FG1File1    5120    40.000000   0.062500
FG1File2    5120    40.000000   0.062500
FG2File1    5248    41.000000   19.750000
FG2File2    5248    41.000000   19.625000


--> now the files of FG2 have 50% free space left
*/

-- try to shrink with truncateonly
DBCC SHRINKFILE('FG2File1', TRUNCATEONLY);
DBCC SHRINKFILE('FG2File2', TRUNCATEONLY);


SELECT 
    f.name AS [filename], 
    fu.total_page_count AS [pageCount], 
    fu.total_page_count/128.0 [sizeMBTotal],
    (fu.total_page_count - unallocated_extent_page_count) /128.0 [sizeMBUsed],
    fu.*
FROM sys.dm_db_file_space_usage fu
INNER JOIN sys.database_files  f ON fu.file_id = f.file_id
INNER JOIN sys.filegroups fg ON fu.filegroup_id = fg.data_space_id
go

/*
filename    pageCount   sizeMBTotal sizeMBUsed
TestLOBPrimary  12800   100.000000  3.062500
FG1File1    5120    40.000000   0.062500
FG1File2    5120    40.000000   0.062500
FG2File1    5048    39.437500   19.750000
FG2File2    5040    39.375000   19.625000

--> no significant effect...still almost 50% free space
*/



-- recreate the table
DROP PARTITION SCHEME PS_TestLobTable;
DROP PARTITION FUNCTION PF_TestLobTable;


DROP TABLE TestLobTable;

CREATE TABLE dbo.TestLobTable
(
    c1  int identity,
    c2  char(8000)      default 'this is a test',
    c3  varchar(max)    NULL
) -- will be created on FG1
go



INSERT INTO dbo.TestLobTable 
( 
    c2, 
    c3
)
VALUES
(
    'this is a test',
    REPLICATE (convert(varchar(max), 'ABC'), 8000) 
)
go 1000 

CREATE UNIQUE CLUSTERED INDEX TestLobTableCL 
ON TestLobTable (c1)
go


sp_blitzIndex
    @databaseName = 'TestLOB',
    @schemaName = 'dbo',
    @tableName = 'TestLobTable'
GO



SELECT 
    f.name AS [filename], 
    fu.total_page_count AS [pageCount], 
    fu.total_page_count/128.0 [sizeMBTotal],
    (fu.total_page_count - unallocated_extent_page_count) /128.0 [sizeMBUsed],
    fu.*
FROM sys.dm_db_file_space_usage fu
INNER JOIN sys.database_files  f ON fu.file_id = f.file_id
INNER JOIN sys.filegroups fg ON fu.filegroup_id = fg.data_space_id
go

/*
filename    pageCount   sizeMBTotal sizeMBUsed
TestLOBPrimary  12800   100.000000  3.125000
FG1File1    5120    40.000000   19.687500
FG1File2    5120    40.000000   19.687500
FG2File1    5048    39.437500   0.062500
FG2File2    5040    39.375000   0.062500

--> data on filegroup 1 again... move them to filegroup2 this time with SORT_IN_TEMPDB
*/

/* moving Lob data using partitioning trick */
CREATE PARTITION FUNCTION PF_TestLobTable (int)
AS RANGE RIGHT FOR VALUES (0)
go


CREATE PARTITION SCHEME PS_TestLobTable 
AS PARTITION PF_TestLobTable
TO ( fg2, fg2 )
go


CREATE UNIQUE CLUSTERED INDEX TestLobTableCL 
ON TestLobTable (c1)
WITH (DROP_EXISTING = ON, SORT_IN_TEMPDB = ON )
ON PS_TestLobTable (c1)
go

-- removing partitioning on table
CREATE UNIQUE CLUSTERED INDEX TestLobTableCL 
ON TestLobTable (c1)
WITH (DROP_EXISTING = ON, SORT_IN_TEMPDB = ON )
ON [fg2]
go


-- now try to shrink with truncateonly
DBCC SHRINKFILE('FG2File1', 20,TRUNCATEONLY);
DBCC SHRINKFILE('FG2File2', 20,TRUNCATEONLY);


SELECT 
    f.name AS [filename], 
    fu.total_page_count AS [pageCount], 
    fu.total_page_count/128.0 [sizeMBTotal],
    (fu.total_page_count - unallocated_extent_page_count) /128.0 [sizeMBUsed],
    fu.*
FROM sys.dm_db_file_space_usage fu
INNER JOIN sys.database_files  f ON fu.file_id = f.file_id
INNER JOIN sys.filegroups fg ON fu.filegroup_id = fg.data_space_id
go

/*
filename    pageCount   sizeMBTotal sizeMBUsed
TestLOBPrimary  12800   100.000000  3.062500
FG1File1    5120    40.000000   0.062500
FG1File2    5120    40.000000   0.062500
FG2File1    5376    42.000000   19.687500
FG2File2    5376    42.000000   19.687500

--> no significant effect...still almost 50% free space
*/


--- retry with separate filegroup


CREATE DATABASE [TestLOB]
 CONTAINMENT = NONE
ON  PRIMARY 
(   NAME = N'TestLOBPrimary'
    , FILENAME = N'U:\DB_DATA\TestLOBPrimary.mdf' 
    , SIZE = 100MB , FILEGROWTH = 1024KB ), 

FILEGROUP [FG1] 
(   NAME = N'FG1File1'
    , FILENAME = N'U:\DB_DATA\FG1File1.ndf' 
    , SIZE = 40MB , FILEGROWTH = 20480KB ), 
(   NAME = N'FG1File2'
    , FILENAME = N'U:\DB_DATA\FG1File2.ndf' 
    , SIZE = 40MB , FILEGROWTH = 20480KB ), 

FILEGROUP [FG2] 
(   NAME = N'FG2File1'
    , FILENAME = N'U:\DB_DATA\FG2File1.ndf' 
    , SIZE = 20MB , FILEGROWTH = 0MB ), 
(   NAME = N'FG2File2'
    , FILENAME = N'U:\DB_DATA\FG2File2.ndf' 
    , SIZE = 20MB , FILEGROWTH = 0MB ),

FILEGROUP [tempLOB] 
(   NAME = N'tempLOB1'
    , FILENAME = N'U:\DB_DATA\templob1.ndf' 
    , SIZE = 20MB , FILEGROWTH = 0MB ), 
(   NAME = N'tempLOB2'
    , FILENAME = N'U:\DB_DATA\templob2.ndf' 
    , SIZE = 20MB , FILEGROWTH = 0MB )

LOG ON 
(   NAME = N'TestLOBLog'
    , FILENAME = N'U:\DB_DATA\TestLOBLog.ldf' 
    , SIZE = 10MB , FILEGROWTH = 10MB)
GO

USE TestLOB
go

ALTER DATABASE TestLOB 
MODIFY FILEGROUP FG1 DEFAULT
go

--DROP TABLE TestLobTable;

CREATE TABLE dbo.TestLobTable
(
    c1  int identity,
    c2  char(8000)      default 'this is a test',
    c3  varchar(max)    NULL
) -- will be created on FG1
go



INSERT INTO dbo.TestLobTable 
( 
    c2, 
    c3
)
VALUES
(
    'this is a test',
    REPLICATE (convert(varchar(max), 'ABC'), 8000) 
)
go 1000 

CREATE UNIQUE CLUSTERED INDEX TestLobTableCL 
ON TestLobTable (c1)
go


sp_blitzIndex
    @databaseName = 'TestLOB',
    @schemaName = 'dbo',
    @tableName = 'TestLobTable'

go

-- size is roughly 40 MB: 1,000 rows; 39.2MB; 31.3MB LOB


SELECT 
    f.name AS [filename], 
    fu.total_page_count AS [pageCount], 
    fu.total_page_count/128.0 [sizeMBTotal],
    (fu.total_page_count - unallocated_extent_page_count) /128.0 [sizeMBUsed],
    fu.*
FROM sys.dm_db_file_space_usage fu
INNER JOIN sys.database_files  f ON fu.file_id = f.file_id
INNER JOIN sys.filegroups fg ON fu.filegroup_id = fg.data_space_id
go

/*
filename    pageCount   sizeMBTotal sizeMBUsed
TestLOBPrimary  12800   100.000000  2.875000
FG1File1    5120    40.000000   19.687500
FG1File2    5120    40.000000   19.687500
FG2File1    2560    20.000000   0.062500
FG2File2    2560    20.000000   0.062500


--> 2*19,687 MB are occupied in Filegroup 1 ---> approx 40 MB in total in Filegroup 1
*/

/* moving Lob data using partitioning trick */
CREATE PARTITION FUNCTION PF_TestLobTable (int)
AS RANGE RIGHT FOR VALUES (0)
go


CREATE PARTITION SCHEME PS_TestLobTable 
AS PARTITION PF_TestLobTable
TO ( tempLOB, tempLOB )
go


CREATE UNIQUE CLUSTERED INDEX TestLobTableCL 
ON TestLobTable (c1)
WITH (DROP_EXISTING = ON)
ON PS_TestLobTable (c1)
go


SELECT 
    f.name AS [filename], 
    fu.total_page_count AS [pageCount], 
    fu.total_page_count/128.0 [sizeMBTotal],
    (fu.total_page_count - unallocated_extent_page_count) /128.0 [sizeMBUsed],
    fu.*
FROM sys.dm_db_file_space_usage fu
INNER JOIN sys.database_files  f ON fu.file_id = f.file_id
INNER JOIN sys.filegroups fg ON fu.filegroup_id = fg.data_space_id
go

/*
filename    pageCount   sizeMBTotal sizeMBUsed
TestLOBPrimary  12800   100.000000  3.062500
FG1File1    5120    40.000000   0.062500
FG1File2    5120    40.000000   0.062500
FG2File1    2560    20.000000   0.062500
FG2File2    2560    20.000000   0.062500
tempLOB1    2560    20.000000   19.687500
tempLOB2    2560    20.000000   19.687500


--> 2*19,687 MB are occupied in Filegroup 1 ---> approx 40 MB in total in Filegroup 1
*/



CREATE UNIQUE CLUSTERED INDEX TestLobTableCL 
ON TestLobTable (c1)
WITH (DROP_EXISTING = ON)
ON [FG2]
go


SELECT 
    f.name AS [filename], 
    fu.total_page_count AS [pageCount], 
    fu.total_page_count/128.0 [sizeMBTotal],
    (fu.total_page_count - unallocated_extent_page_count) /128.0 [sizeMBUsed],
    fu.*
FROM sys.dm_db_file_space_usage fu
INNER JOIN sys.database_files  f ON fu.file_id = f.file_id
INNER JOIN sys.filegroups fg ON fu.filegroup_id = fg.data_space_id
go

/*
filename    pageCount   sizeMBTotal sizeMBUsed
TestLOBPrimary  12800   100.000000  3.062500
FG1File1    5120    40.000000   0.062500
FG1File2    5120    40.000000   0.062500
FG2File1    2560    20.000000   19.625000
FG2File2    2560    20.000000   19.750000
tempLOB1    2560    20.000000   0.062500
tempLOB2    2560    20.000000   0.062500


--> data successfully moved to fg2, tempLOB empty
*/
DROP PARTITION SCHEME PS_TestLobTable 
DROP PARTITION FUNCTION PF_TestLobTable 

ALTER DATABASE TestLOB REMOVE FILE tempLOB1;
ALTER DATABASE TestLOB REMOVE FILE tempLOB2;
ALTER DATABASE TestLOB REMOVE FILEGROUP tempLOB;

/*
    summary:
    - Moving LOB Data with the help of partitioning results in twice the space needed of the original data for temporary rebuilds.
    - To avoid problematic and long running shrinking of database files it's best to use a different filegroup for the first rebuild as this can be easily removed afterwards.
*/