Sql-server – FileStream Db restore

filestreamrestoresql serversql-server-2016

Microsoft SQL Server 2016 (SP1-CU6) (KB4037354) – 13.0.4457.0 (X64)

  • The database's total size is 450 GB, from which 350GB is inital filestream size
  • The average picture size is around 200Kb, the total number is around 118,000, and same number of filestream files.
  • The database configuration has 1 filestream file
  • I have two virtual machines (with IFI enabled on both):

    • The first has read/write of 30MB/s, the second has 300MB/s

    • The backup on the first VM takes around 4 hours, restore takes around 40 hours

    • Restoring on the second VM takes around 4 hours

The question is, why is SQL server on both VMs not using max IO speed of disks when it is clearly available?
The only wait types during both restores are BACKUPIO and BACKUPITHREAD.

Best Answer

The question is, why is SQL server on both VMs not using max IO speed of disks when it is clearly available?

Your testing methodology:

2.) Backup is a single file 3.) Slower one is on SAN with RAID 1, Faster one RAID 1 attached disks 4.) When doing the restore on both instances, servers are idle, and nothing except SQL server is active. Throughput is calculated by comparing fileshare copy/paste through resource monitor, by crystal disk mark, by disk speed during primary filegroup backup, by this restore with filestream.

Unfortunately the testing methodology was flawed, severely, when it comes to how SQL Server works.

2.) Backup is a single file

This will cause SQL Server to use very little resources to read and write so as not to create too much contention on the files and volumes. To get the best backup/restore performance you'll need to backup and restore to/from multiple files and will probably have to change the buffercount and maxtransfersize options also.

3.) Slower one is on SAN with RAID 1, Faster one RAID 1 attached disks

I would totally expect that, there would be more latency to get to the SAN than DAS. This probably (other than latency) won't effect the outcome too awful much, assuming the SAN has the same or more cache than the local disk controllers and the latency isn't too terrible.

Throughput is calculated by comparing fileshare copy/paste through resource monitor, by crystal disk mark [...]

File copies and crystal disk mark are using multiple threads to read and write data combined with buffered I/O. They use completely different flags when opening/writing to files on disk, among even more differences. What's applicable to these tests isn't applicable to other applications or services. It's not that one is good or bad, just they behave completely different.

How can you speed this up?

  • Backup/Restore to multiple files
  • Change BUFFERCOUNT and MAXTRANSFERSIZE
  • Use separate disks on separate controllers for reading vs writing