SQL Server – How Initial Size of a Compressed Backup is Calculated

Tags: backup, compression, performance, sql-server, windows-server

When a backup is created, SQL Server appears to guess the initial size of the backup file. Later on, perhaps when it appends the log, the size is re-adjusted, sometimes multiple times, until the final size is reached.
The bigger the difference, the longer the backup takes (compared to another backup whose size isn't adjusted as much along the way).
Example:
Database1 (size 500 GB, 70 GB of log used) is backed up with compression.
The .bak file is created at 85 GB. After some time, CPU usage goes up and I can see that the .bak file is resized to 136 GB; this happens again until the final backup size of 178 GB is reached.
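One way to watch this happening is to monitor the running backup from a second session; a minimal sketch using sys.dm_exec_requests (the LIKE filter simply catches BACKUP DATABASE and BACKUP LOG commands):

    -- Watch a running backup's progress from another session
    SELECT r.session_id,
           r.command,
           r.percent_complete,
           r.estimated_completion_time / 60000.0 AS est_minutes_remaining,
           r.wait_type
    FROM   sys.dm_exec_requests AS r
    WHERE  r.command LIKE 'BACKUP%';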

When these re-adjustments happen, the average backup speed in MB/s drops compared to other backups on the same machine. Does this come only from appending and clearing the log? Or is it also because of the different data types used in the database, meaning they compress at different rates, or something else?

I came across this topic because my backups usually take around 30 minutes, but the last one took 1.5 hours even though the database didn't grow to 300% of its size. It grew by only 30%.
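To put numbers on that comparison, the backup history in msdb can be queried for duration and compressed size over time; a sketch (the database name is the one from the example above):

    -- Duration and compressed size of recent full backups, from msdb history
    SELECT  bs.database_name,
            bs.backup_start_date,
            DATEDIFF(MINUTE, bs.backup_start_date, bs.backup_finish_date) AS duration_min,
            bs.backup_size / 1048576.0            AS backup_size_mb,
            bs.compressed_backup_size / 1048576.0 AS compressed_size_mb
    FROM    msdb.dbo.backupset AS bs
    WHERE   bs.type = 'D'                      -- full database backups only
      AND   bs.database_name = N'Database1'
    ORDER BY bs.backup_start_date DESC;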

Using wait stats, I could see that besides BACKUPIO I was also waiting on CPU (SOS_SCHEDULER_YIELD) at some point.
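For reference, a snapshot of the backup-related and CPU waits mentioned above can be taken like this (cumulative since the last restart or stats clear):

    -- Cumulative waits relevant to backup throughput and CPU pressure
    SELECT wait_type,
           waiting_tasks_count,
           wait_time_ms,
           signal_wait_time_ms
    FROM   sys.dm_os_wait_stats
    WHERE  wait_type IN ('BACKUPIO', 'BACKUPBUFFER', 'BACKUPTHREAD', 'SOS_SCHEDULER_YIELD')
    ORDER BY wait_time_ms DESC;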

Machine details:
VMware 6.0
32 GB of memory (max server memory for SQL Server set to 28 GB)
2 logical CPUs
Max degree of parallelism: 1 (it's a SharePoint 2013 instance)
SQL Server 2014
Windows Server 2012 R2
running on an SSD RAID 5

Of course, instant file initialization is enabled for the service account that SQL Server runs under.

Best Answer

How does SQL Server calculate the initial size of a (compressed) backup?

From BOL:

For compressed backups, the size of the final backup file depends on how compressible the data is, and this is unknown before the backup operation finishes. Therefore, by default, when backing up a database using compression, the Database Engine uses a pre-allocation algorithm for the backup file. This algorithm pre-allocates a predefined percentage of the size of the database for the backup file. If more space is needed during the backup operation, the Database Engine grows the file. If the final size is less than the allocated space, at the end of the backup operation, the Database Engine shrinks the file to the actual final size of the backup.

To allow the backup file to grow only as needed to reach its final size, use trace flag 3042. Trace flag 3042 causes the backup operation to bypass the default backup compression pre-allocation algorithm.
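For example, a compressed backup with the pre-allocation algorithm bypassed could look like the following sketch (the disk path is a hypothetical placeholder; the database name comes from your example):

    -- Enable trace flag 3042 globally, take the compressed backup, then disable it
    DBCC TRACEON (3042, -1);

    BACKUP DATABASE [Database1]
    TO DISK = N'X:\Backups\Database1.bak'   -- hypothetical backup path
    WITH COMPRESSION, STATS = 10;           -- STATS prints progress every 10 percent

    DBCC TRACEOFF (3042, -1);

With the trace flag active, the backup file grows only as needed to reach its final size, at the cost of repeated file-growth operations during the backup.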

You should follow Backup and restore best practices in SharePoint 2013.

Also, a VM with only 2 logical CPUs seems under-spec for running a 500 GB database (I'm not sure whether there is only one database or multiple databases on the instance).