I recently installed a series of Windows Updates on our SQL Server 2008 R2 server. Note, these were not SQL Server updates, they were just the usual monthly Windows updates. These updates included a couple of re-boots, but otherwise this was just routine maintenance. Since then however, SQL Server has been running at least 4 times slower than usual – backups and queries – and I'm at a loss to figure out why. Backups which used to take 3 hours now take 20. I appreciate there are a myriad reasons why queries can be slow on a restart, but backups?
I've heard there were issues with the Balanced Power Plan when upgrading from SP2 to SP3 with SQL Server 2008 R2, but this server has had SP3 installed for over a year now. I tried changing to the High Performance Plan anyway, but that made no difference to the speed of the backups.
I also checked if there were any pending sp_configure changes that would kick-in (or kick-out) after a restart, but the Error Logs did not show any changes.
Is it possible that the Network Adapters would somehow reset after a re-boot? One of the Windows updates was for the "Intel Ethernet Server Adapter X520-2". Could that be an issue? Would the Jumbo Frames size, or anything other configuration value, reset itself after a driver update?
To give more information about the slow backups, the biggest database I backup is about 3TB, hosted on one Windows drive letter. That Windows drive is made up of six 820GB spindle drives (strictly speaking, it's twelve drives in a RAID-1 formation). Previously, this database would backup in about 3.5 hours at a rate of 192 MB/sec. Now it takes 20 hours at a rate of 32 MB/sec, and I don't understand what's changed. It seems that in the past, data must have been pulled from all six drives simultaneously during the backup to achieve 192 MB/sec, but now I'm guessing the backup is pulling data from just one drive at a time (checking with PerfMon), hence 32 MB/sec. Why? Could this have been caused by a change in the RAID controller, the network adapters, the SQL Server backup command itself, or what? And is it typical for data to be read/written from an array of drives in a parallel fashion? If so, what piece of hardware controls this?
Best Answer
This is a bit of a fishing expedition but here would be my troubleshooting process assuming you don't have your own stats to help clear things up.
Check the history of database size and compressed backup size from your msdb catalogs. Have they changed over the past few weeks.
You need to talk to the staff who are managing the Infrastructure and SAN.
If you're still stuck with no metrics of any kind, go back to your msdb information and use the timings to calculate an MB/s for backups. Now take a copy only backup to NUL (note: this will put a negative load on your system of course) and compare.
If all of this fails for some reason...
As you can see, if you have no previously recorded information about how the system should be acting, it's a matter of thinking of everything it could possibly be, and working out how you're going to rule that in or out through expensive and time wasting investigation. But them's the breaks.