Sql-server – DBCC SHRINKFILE actually doing

shrinksql server

I have a database which was 11TB in size. I recently truncated over 5TB of data from this database.

(I am fully familiar with all the reasons why you would not normally shrink a database please)

I'm curious about what the DBCC SHRINKFILE command is actually doing because when I run the command to shrink a file which is about 650000MB in size with 45% free space it doesn't seem to actually move any pages around.

The code in question is:

USE [CAF] 
GO 
DBCC SHRINKFILE (N'FILENAME' , 650239) 
GO

When I monitor the performance metrics of the server on executing DBCC SHRINKFILE I see the following

% Active time of the disk immediately jumps to 100%.
The sqlservr.exe and system processes both start reading at a continuous 2MB/s
each (sometimes peaking at 5MB/s).
The disk queue length jumps to exactly 1
Disk iops jump to about 250 (note the disks in tests can achieve 6000iops)
CPU remains idle

Even if I tell DBCC SHRINKFILE to shrink the file by a single MB and let it run for 30 minutes, it does not complete.

Everything I've read suggests that DBCC SHRINKFILE should take pages from the end of the file and move them to free space near the beginning but even at the worst performance it should take no more than a few minutes to move 1MB worth of pages

What is DBCC SHRINKFILE actually doing, it just doesn't make sense to me?

Best Answer

Everything I've read suggests that DBCC SHRINKFILE should take pages from the end of the file and move them to free space near the beginning, What is DBCC SHRINKFILE actually doing?

The steps that DBCC SHRINKFILE goes through:

DbccSpaceReclaim Reclaim free space. E.G. by purging empty extents.
DbccFilesCompact move the non-LOB pages to before the point specified as the second parameter, as close to the beginning of the data file as possible
DbccLOBCompact move the LOB pages to before the point specified as the second parameter, as close to the beginning of the data file as possible

During the DbccFilesCompact & DbccLOBCompact pages are moved from after the parameter passed to the DBCC SHRINKFILE statement to before the parameter, as close to the data pages as possible. if it is possible to do so

AFTER all these page movements, the actual truncation happens and you see the size change.

Source

Monitoring the DBCC SHRINKFILE process can give you more information on which of these three takes the longest / uses the most resources (but I have noticed DbccFilesCompact running while Lob Pages where being moved when testing, will have to do some investigating there).

According to this blogpost, Shrinking files is done in batches of ~32 pages per transaction. As to get the ability to resume the shrink when stopping the process.

The statement that when shrinking your database files, you can resume approx. where you left off is also confirmed in this answer.

It seems like nothing happened when cancelling the command, but pages would have been moved in these smaller batches. The size difference is only noticed at the end due to the truncation happening at the end of the DbccCompact events.

Fully logged

When looking at the log file while a shrink is happening, this is also confirmed due to many transaction being logged:

LOP_BEGIN_XACT
LOP_MODIFY_ROW
LOP_MODIFY_ROW
... --many more LOP_MODIFY_ROW records
LOP_COMMIT_XACT
LOP_BEGIN_XACT
...

We are seeing all these LOP_MODIFY_ROW records because of the fact that shrinking is fully logged in the transaction log. Which has additional overhead.

If you don't want all the logging happening at once or you need more time in between shrink operations you could run the shrink in batches, with or without WAITFOR delays.

Another side effect of the multiple transactions is that blocking can occur on multiple occasions while shrinking or trying to shrink your file.

Again, to mitigate blocking you can try shrinking in smaller batches, with time in between shrink.

The first shrink you should try is with: TRUNCATEONLY as to not move the pages and simply try and release space to the OS.

DBCC SHRINKFILE (N'FILENAME' , TRUNCATEONLY)

The database contains lots of binary files and XML data in the MBs how will that impact the shrink?

Shrinking Lob data can take a very long time due to additional index / table scan's to update the pointers. Take a look at this post by Paul Randal

In short, you could see the shrink taking a long time processing DbccLobCompact.

This is due to a pointer in the pages on the table pointing to the 'OFF ROW' values (Lob data) but not the other way around. When moving Lob Data due to the shrink, we have to update the pointer in the on row pages resulting in a lot of extra overhead.

The overhead is similar for nonclustered indexes.

@TiborKaraszi also mentioned:

A couple of more considerations (apart from the LOB aspect). Shrink need an X lock, on which it will wait on forever. Moving a heap page means updating all NC indexes that points to all rows that were affected by this heap-page-movement.

Meaning that the reason for your shrink operation taking so long is that your shrink operation might be blocked by queries & that it can block other queries while it is running.

Additionally, shrinking heaps can be expensive due to the NC index updates.

TL;DR

All in all shrinking larger files with larger objects will take longer due to scanning and updating the pointers in the underlying clustered / nonclustered indexes.

Even if 1mb of lob data is moved, the objects would have to be read and updated accordingly. The speed of this will also be based on your IO subsystem.

Another valid reason could be that your query is blocked due to the exclusive locking behavior.

While the shrink is running you could check for blocking with this query:

SELECT
est.TEXT,
er.blocking_session_id,
er.last_wait_type,
er.reads,
er.cpu_time,
er.total_elapsed_time
FROM sys.dm_exec_requests er
CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS est
WHERE blocking_session_id != 0;

Is shrinking a single big lob record an all or nothing operation?

Another reason where I was not entirely sure of is that an entire Lob object and it pages will be moved as far to the front of the data file as possible, even if not all pages are over the threshold defined in the DBCC SHRINKFILE.

EDIT: Tested this and this statement is false, only the Lob pages above the threshold are moved with DBCCLobCompact even if not all the pages of one record are above this threshold.

When testing with DBCC IND, we only see a few lob pages at the end of the file being moved, even if they belong to one record.

Extra notes

Shrinking files is single threaded
The better alternative for shrinking that Paul Randal points out in one of his blogposts:

Create a new filegroup

Move all affected tables and indexes into the new filegroup using the CREATE INDEX … WITH (DROP_EXISTING = ON) ON syntax, to move the tables and remove fragmentation from them at the same time

Drop the old filegroup that you were going to shrink anyway (or shrink it way down if its the primary filegroup)

Shrinking will cause heavy fragmentation

Related Solutions

Sql-server – Strange behaviour DBCC Shrinkfile

Shrinkfile on a datafile is a single-threaded operation, reusing a small memory buffer.

So the Ninja hardware hasn't got an edge with the extra memory and the 80 cores.

Your local PC however has the benefit of local I/O latency (local disk, i.e. not having to make multiple trips to the SAN).

Sql-server – How to save the results of DBCC SHRINKFILE into a table

This can be accomplished somewhat easily by using a "loop back" Linked Server that has the 'remote proc transaction promotion' property set to false, which avoids the Cannot perform a shrinkfile operation inside a user transaction error by side-stepping the implicit transaction started by the INSERT...EXEC operation. The Linked Server will otherwise only require that the 'rpc out' property be set to true. The 'data access' and 'rpc' properties do not need to be enabled (at least not for me).

Create the Linked Server

USE [master];

DECLARE @LinkedServerName sysname;
SET @LinkedServerName = N'TestDBCC';

IF (NOT EXISTS (SELECT srv.name
                FROM   sys.servers srv
                WHERE  srv.server_id <> 0
                AND    srv.name = @LinkedServerName))
BEGIN
  EXEC [master].dbo.sp_addlinkedserver
    @server = @LinkedServerName, @srvproduct=N'', @provider=N'SQLNCLI11', @datasrc=N'DALI';
  EXEC [master].dbo.sp_addlinkedsrvlogin
    @rmtsrvname=@LinkedServerName,@useself=N'True',@locallogin=NULL,
    @rmtuser=NULL,@rmtpassword=NULL;
END;

--EXEC [master].dbo.sp_serveroption @server=@LinkedServerName,
--  @optname=N'data access', @optvalue=N'true';
--EXEC [master].dbo.sp_serveroption @server=@LinkedServerName,
--  @optname=N'rpc', @optvalue=N'true'; -- is_remote_login_enabled
EXEC [master].dbo.sp_serveroption @server=@LinkedServerName,
  @optname=N'rpc out', @optvalue=N'true';
EXEC [master].dbo.sp_serveroption @server=@LinkedServerName,
  @optname=N'remote proc transaction promotion', @optvalue=N'false';
GO

Use the Linked Server via EXEC() AT

CREATE TABLE #ShrinkFileResults
(
    [DBID] SMALLINT,
    [FileID] INT,
    [CurrentSize] INT,
    [MinimumSize] INT,
    [UsedPages] INT,
    [EstimatedPages] INT
);

INSERT INTO #ShrinkFileResults
  ([DBID], [FileID], [CurrentSize], [MinimumSize], [UsedPages], [EstimatedPages])
  EXEC(N'
     USE [tempdb];
     DBCC SHRINKFILE(N''tempdev'', 0, TRUNCATEONLY);
  ') AT [TestDBCC];

SELECT * FROM #ShrinkFileResults;

This worked for me on both SQL Server 2012 SP3 and SQL Server 2016 RTM.

NOTES

Using OPENQUERY does not work. Initially you get a "meta-data discovery" error which can be resolved by wrapping the DBCC call in another EXEC that uses the WITH RESULT SETS option. But then you get an error about "OPENQUERY cannot process object as it either does not exist or you don't have permission to it". But if you change the logical name of the file to shrink, then you get an error from DBCC stating that it cannot find a file of that name, along with a secondary error from OPENQUERY stating that the call did not return any result sets, all of which indicates that with the correct file name it is get farther.
This operation can also be rather simply accomplished via SQLCLR. One can create a UDF / Scalar function, in fact, using an external connection (not a Context Connection), which would allow for being called in a set-based operation in the SELECT list. That would also only take a few lines of code, and minimize security risk when compared to enabling xp_cmdshell or creating a Linked Server.

Best Answer

Related Solutions

Sql-server – Strange behaviour DBCC Shrinkfile

Sql-server – How to save the results of DBCC SHRINKFILE into a table

Related Question