Sql-server – I/O errors backing up to an SMB Share from MSSQL Server 2014

aws, sql server 2014

I'm having issues with SQL backups that are written from an EC2 SQL Server to an SMB share on an AWS Storage Gateway appliance. I consistently get the same error in the SQL event log for two SQL instances:

BackupIoRequest::ReportIoError: Write failure on backup device
'Path_To_Backup' Operating System Error 59 (An unexpected network
error occurred)

This causes several DB backups to fail at the same moment, down to the second, leaving an invalid backup file. It happens about 2 hours into the backup of this 150-200 GB database, at around 1 AM. The weird part is that other servers back up to the same path without experiencing the failure. I've been working with AWS Support, and they have been unable to find a related network or Storage Gateway error. I have scoured the Windows and SQL logs but can't find any correlated events. The errors I do have seem to be dead ends for this particular problem.

Is there anything I can monitor in SQL Server to collect more information about the backup failure than I'm getting from the default Windows and SQL logs? Is it a valid backup strategy to back up to a share when you're writing ~600 GB a night from many SQL instances?

Best Answer

Just following up on this. Backups have been running flawlessly since we made a change to the SMB timeout in the registry:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanWorkstation\Parameters]  
"SessTimeout"=dword:00000258

That works out to 600 seconds (hex 0x258), a 10-minute timeout for SMB sessions initiated from the SQL Server. The default is 60 seconds. In a .reg file the dword value has to be written in hex.
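To double-check the hex arithmetic for a different timeout, a small conversion sketch can help. The helper names below are hypothetical and not part of any Windows tooling. They only format and parse the `dword:` notation used in .reg files; they do not touch the registry.

```python
def seconds_to_reg_dword(seconds):
    """Format a timeout in seconds as a .reg-style DWORD (8 hex digits)."""
    if not 0 <= seconds <= 0xFFFFFFFF:
        raise ValueError("a DWORD holds values from 0 to 4294967295")
    return f"dword:{seconds:08x}"


def reg_dword_to_seconds(value):
    """Parse a .reg-style DWORD string back into a decimal number of seconds."""
    prefix = "dword:"
    if not value.lower().startswith(prefix):
        raise ValueError("expected a value of the form dword:XXXXXXXX")
    return int(value[len(prefix):], 16)


if __name__ == "__main__":
    print(seconds_to_reg_dword(600))               # dword:00000258
    print(reg_dword_to_seconds("dword:00000258"))  # 600
```

Running it confirms that 600 seconds corresponds to `dword:00000258`, the value used above.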

I found it in the article: CIFS and SMB Timeouts in Windows

It's the line with a heading of Client Session timeout.

We arrived at this solution after verifying in a packet capture that the SQL Server itself was ending the session by sending an RST, ACK.
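For anyone reading a similar capture, "RST, ACK" is how capture tools label a TCP segment with both the RST and ACK flag bits set. The sketch below is only an illustration of how those bits decode from the TCP flags byte. The function name is hypothetical, and it is not the tool we used for the capture.

```python
# Standard TCP flag bits in the low six bits of the flags byte.
TCP_FLAGS = {
    0x01: "FIN",
    0x02: "SYN",
    0x04: "RST",
    0x08: "PSH",
    0x10: "ACK",
    0x20: "URG",
}


def decode_tcp_flags(flags):
    """Return the names of the flag bits set in a TCP flags byte."""
    return [name for bit, name in TCP_FLAGS.items() if flags & bit]


if __name__ == "__main__":
    # 0x14 = RST (0x04) | ACK (0x10): an abrupt reset rather than a FIN close.
    print(decode_tcp_flags(0x14))  # ['RST', 'ACK']
```

The direction of the reset matters here: seeing it originate from the SQL Server side is what pointed to a client-side timeout rather than the Storage Gateway.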
