Sql-server – DBCC SHRINKFILE actually doing

shrinksql server

I have a database which was 11TB in size. I recently truncated over 5TB of data from this database.

(I am fully familiar with all the reasons why you would not normally shrink a database please)

I'm curious about what the DBCC SHRINKFILE command is actually doing because when I run the command to shrink a file which is about 650000MB in size with 45% free space it doesn't seem to actually move any pages around.

The code in question is:

USE [CAF] 
GO 
DBCC SHRINKFILE (N'FILENAME' , 650239) 
GO

When I monitor the performance metrics of the server on executing DBCC SHRINKFILE I see the following

  • % Active time of the disk immediately jumps to 100%.
  • The sqlservr.exe and system processes both start reading at a continuous 2MB/s
    each (sometimes peaking at 5MB/s).
  • The disk queue length jumps to exactly 1
  • Disk iops jump to about 250 (note the disks in tests can achieve 6000iops)
    CPU remains idle

Even if I tell DBCC SHRINKFILE to shrink the file by a single MB and let it run for 30 minutes, it does not complete.

Everything I've read suggests that DBCC SHRINKFILE should take pages from the end of the file and move them to free space near the beginning but even at the worst performance it should take no more than a few minutes to move 1MB worth of pages

What is DBCC SHRINKFILE actually doing, it just doesn't make sense to me?

Best Answer

Everything I've read suggests that DBCC SHRINKFILE should take pages from the end of the file and move them to free space near the beginning, What is DBCC SHRINKFILE actually doing?

The steps that DBCC SHRINKFILE goes through:

  • DbccSpaceReclaim Reclaim free space. E.G. by purging empty extents.
  • DbccFilesCompact move the non-LOB pages to before the point specified as the second parameter, as close to the beginning of the data file as possible
  • DbccLOBCompact move the LOB pages to before the point specified as the second parameter, as close to the beginning of the data file as possible

During the DbccFilesCompact & DbccLOBCompact pages are moved from after the parameter passed to the DBCC SHRINKFILE statement to before the parameter, as close to the data pages as possible. if it is possible to do so

AFTER all these page movements, the actual truncation happens and you see the size change.

enter image description here

Source

Monitoring the DBCC SHRINKFILE process can give you more information on which of these three takes the longest / uses the most resources (but I have noticed DbccFilesCompact running while Lob Pages where being moved when testing, will have to do some investigating there).

According to this blogpost, Shrinking files is done in batches of ~32 pages per transaction. As to get the ability to resume the shrink when stopping the process.

The statement that when shrinking your database files, you can resume approx. where you left off is also confirmed in this answer.

It seems like nothing happened when cancelling the command, but pages would have been moved in these smaller batches. The size difference is only noticed at the end due to the truncation happening at the end of the DbccCompact events.

Fully logged

When looking at the log file while a shrink is happening, this is also confirmed due to many transaction being logged:

LOP_BEGIN_XACT
LOP_MODIFY_ROW
LOP_MODIFY_ROW
... --many more LOP_MODIFY_ROW records
LOP_COMMIT_XACT
LOP_BEGIN_XACT
...

We are seeing all these LOP_MODIFY_ROW records because of the fact that shrinking is fully logged in the transaction log. Which has additional overhead.

If you don't want all the logging happening at once or you need more time in between shrink operations you could run the shrink in batches, with or without WAITFOR delays.

Another side effect of the multiple transactions is that blocking can occur on multiple occasions while shrinking or trying to shrink your file.

Again, to mitigate blocking you can try shrinking in smaller batches, with time in between shrink.

The first shrink you should try is with: TRUNCATEONLY as to not move the pages and simply try and release space to the OS.

DBCC SHRINKFILE (N'FILENAME' , TRUNCATEONLY)

The database contains lots of binary files and XML data in the MBs how will that impact the shrink?

Shrinking Lob data can take a very long time due to additional index / table scan's to update the pointers. Take a look at this post by Paul Randal

In short, you could see the shrink taking a long time processing DbccLobCompact.

This is due to a pointer in the pages on the table pointing to the 'OFF ROW' values (Lob data) but not the other way around. When moving Lob Data due to the shrink, we have to update the pointer in the on row pages resulting in a lot of extra overhead.

The overhead is similar for nonclustered indexes.


@TiborKaraszi also mentioned:

A couple of more considerations (apart from the LOB aspect). Shrink need an X lock, on which it will wait on forever. Moving a heap page means updating all NC indexes that points to all rows that were affected by this heap-page-movement.

Meaning that the reason for your shrink operation taking so long is that your shrink operation might be blocked by queries & that it can block other queries while it is running.

Additionally, shrinking heaps can be expensive due to the NC index updates.

TL;DR

All in all shrinking larger files with larger objects will take longer due to scanning and updating the pointers in the underlying clustered / nonclustered indexes.

Even if 1mb of lob data is moved, the objects would have to be read and updated accordingly. The speed of this will also be based on your IO subsystem.

Another valid reason could be that your query is blocked due to the exclusive locking behavior.

While the shrink is running you could check for blocking with this query:

SELECT
est.TEXT,
er.blocking_session_id,
er.last_wait_type,
er.reads,
er.cpu_time,
er.total_elapsed_time
FROM sys.dm_exec_requests er
CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS est
WHERE blocking_session_id != 0;

Is shrinking a single big lob record an all or nothing operation?

Another reason where I was not entirely sure of is that an entire Lob object and it pages will be moved as far to the front of the data file as possible, even if not all pages are over the threshold defined in the DBCC SHRINKFILE.

EDIT: Tested this and this statement is false, only the Lob pages above the threshold are moved with DBCCLobCompact even if not all the pages of one record are above this threshold.

When testing with DBCC IND, we only see a few lob pages at the end of the file being moved, even if they belong to one record.


Extra notes

  • Shrinking files is single threaded
  • The better alternative for shrinking that Paul Randal points out in one of his blogposts:
  • Create a new filegroup
  • Move all affected tables and indexes into the new filegroup using the CREATE INDEX … WITH (DROP_EXISTING = ON) ON syntax, to move the tables and remove fragmentation from them at the same time
  • Drop the old filegroup that you were going to shrink anyway (or shrink it way down if its the primary filegroup)
  • Shrinking will cause heavy fragmentation