Sql-server – FileStream Db restore

Tags: filestream, restore, sql-server, sql-server-2016

Microsoft SQL Server 2016 (SP1-CU6) (KB4037354) – 13.0.4457.0 (X64)

The database's total size is 450 GB, of which 350 GB is the initial FILESTREAM size.
The average picture size is around 200 KB; there are around 118,000 pictures and the same number of FILESTREAM files.
The database is configured with one FILESTREAM file.
I have two virtual machines (with IFI enabled on both):
- The first has read/write of 30MB/s, the second has 300MB/s
- The backup on the first VM takes around 4 hours, restore takes around 40 hours
- Restoring on the second VM takes around 4 hours
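
For scale, the effective throughput implied by these timings can be worked out with a quick calculation. This is only a rough sketch: it assumes the whole 450 GB is transferred in each operation (no backup compression) and that 1 GB = 1024 MB.

```sql
-- Assumption: the full 450 GB database is read or written during each operation.
DECLARE @size_mb decimal(18, 1) = 450.0 * 1024;  -- 460,800 MB

SELECT
    CAST(@size_mb / (4 * 3600.0)  AS decimal(10, 1)) AS mb_per_sec_for_4_hours,   -- about 32 MB/s
    CAST(@size_mb / (40 * 3600.0) AS decimal(10, 1)) AS mb_per_sec_for_40_hours;  -- about 3.2 MB/s
```

Under those assumptions, the 4-hour runs average roughly 32 MB/s and the 40-hour restore roughly 3.2 MB/s, against disks measured at 30 MB/s and 300 MB/s.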

The question is: why is SQL Server on both VMs not using the maximum I/O speed of the disks when it is clearly available?
The only wait types during both restores are BACKUPIO and BACKUPTHREAD.
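
One way to observe these waits while a restore is running is to query the wait-statistics and request DMVs. The sketch below is illustrative rather than part of the original question; BACKUPBUFFER is added to the filter as an assumption that it may also be relevant.

```sql
-- Cumulative backup/restore-related waits since the last restart
-- (or since the wait statistics were last cleared).
SELECT wait_type, waiting_tasks_count, wait_time_ms, max_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN (N'BACKUPIO', N'BACKUPTHREAD', N'BACKUPBUFFER')
ORDER BY wait_time_ms DESC;

-- Progress and current wait of any running backup or restore.
SELECT session_id, command, percent_complete, wait_type, last_wait_type,
       total_elapsed_time / 1000 AS elapsed_seconds
FROM sys.dm_exec_requests
WHERE command LIKE N'RESTORE%' OR command LIKE N'BACKUP%';
```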

Best Answer

The question is: why is SQL Server on both VMs not using the maximum I/O speed of the disks when it is clearly available?

Your testing methodology:

2.) Backup is a single file
3.) Slower one is on SAN with RAID 1, Faster one RAID 1 attached disks
4.) When doing the restore on both instances, servers are idle, and nothing except SQL server is active. Throughput is calculated by comparing fileshare copy/paste through resource monitor, by crystal disk mark, by disk speed during primary filegroup backup, by this restore with filestream.

Unfortunately, this testing methodology is severely flawed with respect to how SQL Server actually performs backup and restore I/O.

2.) Backup is a single file

With a single backup file, SQL Server uses very few resources for reading and writing, so as not to create too much contention on the files and volumes. To get the best backup/restore performance, back up to and restore from multiple files; you will probably also need to change the BUFFERCOUNT and MAXTRANSFERSIZE options.
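
As a minimal sketch of a striped backup, assuming a hypothetical database name and hypothetical paths (none of these come from the question), and with option values that are starting points for testing rather than recommendations:

```sql
-- Hypothetical database name and paths; tune the option values by testing.
BACKUP DATABASE [MyDatabase]
TO  DISK = N'E:\Backup\MyDatabase_1.bak',
    DISK = N'E:\Backup\MyDatabase_2.bak',
    DISK = N'F:\Backup\MyDatabase_3.bak',
    DISK = N'F:\Backup\MyDatabase_4.bak'
WITH
    BUFFERCOUNT = 64,           -- number of I/O buffers used by the operation
    MAXTRANSFERSIZE = 4194304,  -- 4 MB per transfer, the largest value allowed
    STATS = 5;                  -- progress message every 5 percent
```

The buffers consume roughly BUFFERCOUNT x MAXTRANSFERSIZE of memory (256 MB with the values above), so large values are worth raising gradually.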

3.) Slower one is on SAN with RAID 1, Faster one RAID 1 attached disks

I would fully expect that; there is more latency getting to a SAN than to DAS. Other than latency, this probably won't affect the outcome much, assuming the SAN has the same or more cache than the local disk controllers and the latency isn't too terrible.

Throughput is calculated by comparing fileshare copy/paste through resource monitor, by crystal disk mark [...]

File copies and CrystalDiskMark use multiple threads to read and write data, combined with buffered I/O. They open and write files with completely different flags, among other differences. What applies to these tests doesn't necessarily apply to other applications or services. It's not that one is good or bad; they just behave completely differently.
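
Rather than comparing against file copies, one option is to look at what SQL Server itself achieved. The query below is a sketch that derives backup throughput from the history in msdb; it assumes the backup history has not been purged.

```sql
-- Throughput of the most recent backups, computed from msdb history.
SELECT TOP (20)
    database_name,
    type,  -- D = full, I = differential, L = log
    backup_start_date,
    backup_finish_date,
    CAST(backup_size / 1048576.0 AS decimal(18, 1)) AS backup_size_mb,
    CAST(backup_size / 1048576.0
         / NULLIF(DATEDIFF(SECOND, backup_start_date, backup_finish_date), 0)
         AS decimal(18, 1)) AS mb_per_sec  -- NULL when the backup took under a second
FROM msdb.dbo.backupset
ORDER BY backup_finish_date DESC;
```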

How can you speed this up?

Backup/Restore to multiple files
Change BUFFERCOUNT and MAXTRANSFERSIZE
Use separate disks on separate controllers for reading vs writing
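
The first two points can be sketched as a restore from a striped backup set. The database name, logical file names, and paths are hypothetical, and the option values are only starting points. Trace flags 3213 and 3605 are commonly used to write the chosen buffer configuration to the error log, but treat that behaviour as an assumption to verify on your build.

```sql
-- Optional: log the backup/restore buffer configuration to the error log.
DBCC TRACEON (3605, 3213, -1);

-- Hypothetical names and paths. Every file of the striped backup set must be listed.
RESTORE DATABASE [MyDatabase]
FROM DISK = N'E:\Backup\MyDatabase_1.bak',
     DISK = N'E:\Backup\MyDatabase_2.bak',
     DISK = N'F:\Backup\MyDatabase_3.bak',
     DISK = N'F:\Backup\MyDatabase_4.bak'
WITH
    MOVE N'MyDatabase'     TO N'G:\Data\MyDatabase.mdf',
    MOVE N'MyDatabase_fs'  TO N'H:\Filestream\MyDatabase_fs',  -- FILESTREAM container (a directory)
    MOVE N'MyDatabase_log' TO N'L:\Log\MyDatabase_log.ldf',
    BUFFERCOUNT = 64,
    MAXTRANSFERSIZE = 4194304,
    STATS = 5;

DBCC TRACEOFF (3605, 3213, -1);
```

Placing the backup files and the destination data files on different disks and controllers, as in the hypothetical drive letters above, addresses the third point.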

Related Solutions

SQL Server 2008 R2 – Restore Database Excluding FILESTREAM Data

What you're trying to do would leave the database in a (transactionally) inconsistent state, hence it isn't possible.

The Partial Database Availability whitepaper is a useful reference guide and includes an example of how to check whether a particular table or file is online. If your data access were through stored procedures, you could relatively easily incorporate that check.
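
A check along those lines might look like the sketch below. It is not taken from the whitepaper; the logical file name is hypothetical, and the queries must run in the context of the database in question.

```sql
-- List every file with its filegroup and current state.
SELECT
    fg.name AS filegroup_name,   -- NULL for log files
    df.name AS logical_file_name,
    df.type_desc,
    df.state_desc                -- ONLINE, OFFLINE, RESTORING, RECOVERY_PENDING, ...
FROM sys.database_files AS df
LEFT JOIN sys.filegroups AS fg
    ON fg.data_space_id = df.data_space_id;

-- Guard that a stored procedure could run before touching FILESTREAM data.
IF EXISTS (SELECT 1
           FROM sys.database_files
           WHERE name = N'MyDatabase_FG2'      -- hypothetical logical file name
             AND state_desc <> N'ONLINE')
    PRINT 'FILESTREAM file is not online; skipping binary content.';
```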

One alternative (but somewhat hacky) approach that might be worth a look in your scenario would be to hide the table and replace it with a view.

-- NB: SQLCMD script
:ON ERROR EXIT
:setvar DatabaseName "TestRename"
:setvar FilePath "D:\MSSQL\I3\Data\"

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
SET NOCOUNT ON;
GO

USE master;
GO

IF EXISTS (SELECT name FROM sys.databases WHERE name = N'$(DatabaseName)')
  DROP DATABASE $(DatabaseName)
GO

CREATE DATABASE $(DatabaseName) 
ON PRIMARY 
  (
  NAME = N'$(DatabaseName)'
  , FILENAME = N'$(FilePath)$(DatabaseName).mdf'
  , SIZE = 5MB
  , MAXSIZE = UNLIMITED
  , FILEGROWTH = 1MB
  ) 
, FILEGROUP [FG1] DEFAULT
  ( 
  NAME = N'$(DatabaseName)_FG1_File1'
  , FILENAME = N'$(FilePath)$(DatabaseName)_FG1_File1.ndf'
  , SIZE = 1MB
  , MAXSIZE = UNLIMITED
  , FILEGROWTH = 1MB 
  ) 
, FILEGROUP [FG2] CONTAINS FILESTREAM
  ( 
  NAME = N'$(DatabaseName)_FG2'
  , FILENAME = N'$(FilePath)Filestream'
  )
LOG ON 
  ( 
  NAME = N'$(DatabaseName)_log'
  , FILENAME = N'$(FilePath)$(DatabaseName)_log.ldf'
  , SIZE = 1MB
  , MAXSIZE = UNLIMITED
  , FILEGROWTH = 1MB
  )
GO

USE $(DatabaseName);
GO

CREATE TABLE [dbo].[BinaryContent](
    [BinaryContentID] [int] IDENTITY(1,1) NOT NULL
    , [FileName] [varchar](50) NOT NULL
    , [BinaryContentRowGUID] [uniqueidentifier] ROWGUIDCOL UNIQUE DEFAULT (NEWSEQUENTIALID()) NOT NULL
    , [FileContentBinary] VARBINARY(MAX) FILESTREAM NULL
) ON [PRIMARY] FILESTREAM_ON [FG2]
GO 

-- Insert test rows
INSERT
  dbo.BinaryContent
  (
  [FileName]
  , [FileContentBinary]
  )
VALUES
  (
  CAST(NEWID() AS VARCHAR(36))
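  -- VARBINARY(MAX) avoids the silent truncation to 30 bytes that a bare VARBINARY cast applies
  , CAST(REPLICATE(CAST(NEWID() AS VARCHAR(36)), 100) AS VARBINARY(MAX))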
  );
GO 100

USE master;
GO

-- Take FILESTREAM filegroup offline
ALTER DATABASE $(DatabaseName)
MODIFY FILE (NAME = '$(DatabaseName)_FG2', OFFLINE)
GO

USE $(DatabaseName);
GO

-- Rename table to make way for view
EXEC sp_rename 'dbo.BinaryContent', 'BinaryContentTable', 'OBJECT';
GO

-- Create view to return content from table but with NULL FileContentBinary
CREATE VIEW dbo.BinaryContent
AS

SELECT
  [BinaryContentID]
  , [FileName]
  , [BinaryContentRowGUID]
  -- Typed NULL so the view column keeps the table's VARBINARY(MAX) type
  , [FileContentBinary] = CAST(NULL AS VARBINARY(MAX))
FROM
  [dbo].[BinaryContentTable];
GO

-- Check results as expected
SELECT TOP 10
  *
FROM
  dbo.BinaryContent;
GO

Sql-server – SQL Server Leaked Transactions

The DBCC OPENTRAN command shows open transactions.

http://msdn.microsoft.com/en-us/library/ms182792.aspx

Displays information about the oldest active transaction and the oldest distributed and nondistributed replicated transactions, if any, within the specified database. Results are displayed only if there is an active transaction or if the database contains replication information.
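
A short usage sketch, run in the context of the database being checked. The second query is an addition, not part of the original answer: it lists all session-bound active transactions rather than only the oldest one.

```sql
-- Oldest active transaction in the current database, returned as a result set.
DBCC OPENTRAN WITH TABLERESULTS, NO_INFOMSGS;

-- All active transactions that are bound to a session, oldest first.
SELECT
    tst.session_id,
    tat.transaction_id,
    tat.name,
    tat.transaction_begin_time
FROM sys.dm_tran_session_transactions AS tst
JOIN sys.dm_tran_active_transactions AS tat
    ON tat.transaction_id = tst.transaction_id
ORDER BY tat.transaction_begin_time;
```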
