I have a database which was 11TB in size. I recently truncated over 5TB of data from this database.
(Please note: I'm fully aware of all the reasons why you would not normally shrink a database.)
I'm curious about what the DBCC SHRINKFILE command is actually doing, because when I run it to shrink a file that is about 650,000 MB in size with 45% free space, it doesn't seem to actually move any pages around.
The code in question is:
USE [CAF]
GO
DBCC SHRINKFILE (N'FILENAME' , 650239)
GO
When I monitor the server's performance metrics while executing DBCC SHRINKFILE, I see the following:
- % Active time of the disk immediately jumps to 100%.
- The sqlservr.exe and system processes both start reading at a continuous 2 MB/s each (sometimes peaking at 5 MB/s).
- The disk queue length jumps to exactly 1.
- Disk IOPS jump to about 250 (note: the disks can achieve 6,000 IOPS in tests).
- CPU remains idle.
Even if I tell DBCC SHRINKFILE to shrink the file by a single MB and let it run for 30 minutes, it does not complete.
Everything I've read suggests that DBCC SHRINKFILE should take pages from the end of the file and move them to free space near the beginning, but even at the worst performance it should take no more than a few minutes to move 1 MB worth of pages.
What is DBCC SHRINKFILE actually doing? It just doesn't make sense to me.
Best Answer
The steps that DBCC SHRINKFILE goes through:

1. DbccSpaceReclaim — reclaim free space, e.g. by purging empty extents.
2. DbccFilesCompact — move the non-LOB pages to before the point specified as the second parameter, as close to the beginning of the data file as possible.
3. DbccLOBCompact — move the LOB pages to before the point specified as the second parameter, as close to the beginning of the data file as possible.

During DbccFilesCompact and DbccLOBCompact, pages are moved from after the target size passed to the DBCC SHRINKFILE statement to before it, as close to the existing data pages as possible, if it is possible to do so. AFTER all these page movements, the actual truncation happens and you see the size change.
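Before that final truncation can release anything, it helps to know how much free space the file actually has. A quick check, run in the database being shrunk (sizes in sys.database_files are in 8 KB pages, so dividing by 128 converts to MB):

```sql
-- Free space per file in the current database
SELECT name,
       size / 128 AS size_mb,
       size / 128 - CAST(FILEPROPERTY(name, 'SpaceUsed') AS int) / 128 AS free_mb
FROM sys.database_files;
```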
Source
Monitoring the DBCC SHRINKFILE process can give you more information on which of these three phases takes the longest / uses the most resources (though I have noticed DbccFilesCompact running while LOB pages were being moved during testing, so I will have to do some investigating there). According to this blog post, shrinking files is done in batches of ~32 pages per transaction, which gives you the ability to resume the shrink after stopping the process.
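One simple way to watch the phases from another session is sys.dm_exec_requests: the command column shows the internal phase name while the shrink runs, and percent_complete is populated for DBCC SHRINKFILE:

```sql
-- Watch a running shrink from a second session
SELECT session_id, command, percent_complete, wait_type, wait_time
FROM sys.dm_exec_requests
WHERE command LIKE 'Dbcc%';
```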
The statement that, when shrinking your database files, you can resume approximately where you left off is also confirmed in this answer.

It may seem like nothing happened when cancelling the command, but pages will have been moved in these smaller batches. The size difference is only noticed at the end because the truncation happens after the DbccCompact events.

Fully logged

When looking at the log file while a shrink is happening, this is also confirmed by the many transactions being logged:
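You can see this for yourself with the undocumented fn_dblog function (use on a test system only; reading the whole log can itself be slow on a busy server):

```sql
-- Count log records by operation while a shrink is running (undocumented function)
SELECT [Operation], COUNT(*) AS log_records
FROM fn_dblog(NULL, NULL)
GROUP BY [Operation]
ORDER BY COUNT(*) DESC;
```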
We are seeing all these LOP_MODIFY_ROW records because shrinking is fully logged in the transaction log, which adds overhead. If you don't want all the logging to happen at once, or you need more time between shrink operations, you can run the shrink in batches, with or without WAITFOR delays.

Another side effect of the multiple transactions is that blocking can occur on multiple occasions while shrinking, or while trying to shrink, your file. Again, to mitigate blocking you can try shrinking in smaller batches, with time in between shrinks.
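A batched shrink with WAITFOR delays could be sketched like this (the starting size and the one-minute delay are assumptions; the file name and final target are taken from the question — adjust all three to your environment):

```sql
-- Shrink in 1 GB steps with a pause between batches (sizes in MB)
DECLARE @target  int = 650239;   -- final target size from the question
DECLARE @current int = 681574;   -- assumed current size; check sys.database_files
WHILE @current > @target
BEGIN
    SET @current = CASE WHEN @current - 1024 < @target
                        THEN @target
                        ELSE @current - 1024 END;
    DBCC SHRINKFILE (N'FILENAME', @current);
    WAITFOR DELAY '00:01:00';    -- breathing room for other workloads
END;
```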
The first shrink you should try is with TRUNCATEONLY, so as not to move any pages and simply try to release free space to the OS.

Shrinking LOB data can take a very long time due to the additional index / table scans needed to update the pointers. Take a look at this post by Paul Randal.
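For reference, the TRUNCATEONLY form only releases free space at the end of the file and moves no pages, so it completes quickly:

```sql
-- Release trailing free space without moving any pages
DBCC SHRINKFILE (N'FILENAME', TRUNCATEONLY);
```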
In short, you could see the shrink spending a long time in DbccLOBCompact. This is because the in-row pages of the table hold pointers to the 'OFF ROW' values (LOB data), but not the other way around. When moving LOB data during the shrink, we have to update the pointer in the in-row pages, resulting in a lot of extra overhead. The overhead is similar for nonclustered indexes.
@TiborKaraszi also mentioned:
Meaning that the reason your shrink operation takes so long might be that it is blocked by queries, and that it can in turn block other queries while it is running.
Additionally, shrinking heaps can be expensive due to the NC index updates.
TL;DR
All in all, shrinking larger files with larger objects will take longer due to scanning and updating the pointers in the underlying clustered / nonclustered indexes.
Even if only 1 MB of LOB data is moved, the referencing objects have to be read and updated accordingly. The speed of this will also depend on your IO subsystem.
Another valid reason could be that your query is blocked due to the exclusive locking behaviour.
While the shrink is running you could check for blocking with this query:
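The original query was not reproduced here; a minimal sketch using sys.dm_exec_requests that lists blocked sessions and who is blocking them would be:

```sql
-- Sessions currently blocked, and the session blocking each of them
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time,
       r.command
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0;
```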
Is shrinking a single big lob record an all or nothing operation?
Another possibility, of which I was not entirely sure, was that an entire LOB object and its pages would be moved as far to the front of the data file as possible, even if not all of its pages are beyond the threshold defined in the DBCC SHRINKFILE command.

EDIT: I tested this, and the statement is false: only the LOB pages above the threshold are moved by DbccLOBCompact, even if not all the pages of one record are above that threshold. When testing with DBCC IND, we only see the few LOB pages at the end of the file being moved, even if they belong to one record.

Extra notes