I personally have used Raxco PerfectDisk (I am not associated with the company or any of its employees in any way) to do online defrags of SQL Server LUNs. It works perfectly, though it does slow the server down a bit, so I would recommend doing it during periods of lighter activity. When I say "works perfectly" I mean that it did not corrupt the volume or the SQL data files.
The built-in defragmenter does a very poor job if certain structures are fragmented on the drive. PerfectDisk shows you details about all the items that are fragmented, including the NTFS allocation tables, directories, alternate file streams, and so on.
Does PerfectDisk defragment SQL databases? http://support.raxco.com/KB/a106/does-perfectdisk-defragment-sql-databases.aspx
See this archived copy of a TechNet article regarding SQL Server and fragmentation: http://web.archive.org/web/20100803204458/http://www.microsoft.com/technet/abouttn/flash/tips/tips_083104.mspx
I didn't get a chance to rerun the trace and capture the more nicely formatted XDL file, but in the meantime I was able to resolve the issue by removing all the isolation level and locking hints from the trigger and modifying the application code to run the INSERT itself from within a serializable transaction.
The app uses LINQ to SQL, so the INSERT statement is dynamically generated and run under the default READ COMMITTED isolation level. From the deadlock trace, I could tell that the locks taken by the INSERT were conflicting with the locks taken by the trigger. The deadlock was happening even after removing all the isolation and locking hints from the trigger, so I surmised that the trigger's UPDATE statements were causing stronger locks than the INSERT statement. And when the competing simultaneous transaction tried to do the same thing, the deadlock occurred.
We were definitely on the right track with our previous attempted solutions: using higher isolation levels and stronger locking in the trigger. The problem is that the weaker locks were already taken by the INSERT statement generated by LINQ to SQL, before the trigger ever got called. The solution is to explicitly start a transaction in our .NET code using the SERIALIZABLE isolation level, do the insert in that transaction, let the trigger run in the same transaction (without altering the isolation level or lock hints), and then commit and dispose the transaction. When I did that, I was no longer able to reproduce the issue (I even put it back the old way and verified that the issue returned). Here is the new code:
' Do the insert in a serializable transaction to prevent simultaneous inserts from deadlocking
' due to the trigger, which reaches out to other rows.
Me.packageContext.Connection.Open()
Try
    Using oTransaction = Me.packageContext.Connection.BeginTransaction(IsolationLevel.Serializable)
        Me.packageContext.Transaction = oTransaction
        Try
            Me.packageContext.SubmitChanges()
            oTransaction.Commit()
        Catch ex As Exception
            oTransaction.Rollback()
            Throw
        End Try
    End Using
Finally
    ' We need to close the connection and null out the transaction to allow
    ' subsequent uses of the same DataContext to work properly outside this transaction.
    Me.packageContext.Connection.Close()
    Me.packageContext.Transaction = Nothing
End Try
Normally we avoid opening transactions from within client-side code, because they can run long if the code makes remote API calls, causing excessive blocking. But in this case we are isolating a very small bit of code, and we have robust error/rollback handling around it.
An alternate solution would be to implement stored procedures that do the INSERTs within explicit SERIALIZABLE transactions, and then alter the DataContext to call the stored procedure. However, that solution is less maintainable as it requires updating both the stored procedure and the DataContext definition every time the schema changes.
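To illustrate, here is a rough sketch of what such a stored procedure might look like; the procedure, table, and column names are hypothetical, and THROW assumes SQL Server 2012 or later:

-- Hypothetical example: wrap the INSERT in an explicit SERIALIZABLE transaction
-- so that the trigger also runs under the stronger isolation level.
CREATE PROCEDURE dbo.InsertPackage
    @Name nvarchar(100),
    @Status int
AS
BEGIN
    SET NOCOUNT ON;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

    BEGIN TRY
        BEGIN TRANSACTION;

        INSERT INTO dbo.Package (Name, Status)
        VALUES (@Name, @Status);
        -- The trigger on dbo.Package fires here, inside the same serializable transaction.

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;
        THROW;
    END CATCH
END;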
Finally, another potential alternate solution would be to examine the deadlock log in detail and find a way to adjust our indexes or keys to avoid the deadlock in the first place. However, that is more difficult and may not be possible with our design.
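For completeness, if we did want to dig into the deadlock details further, one common way to pull recent deadlock graphs without setting up a new trace is to query the built-in system_health Extended Events session (available in SQL Server 2008 and later), along these lines:

-- Pull xml_deadlock_report events from the system_health ring buffer.
SELECT
    xed.value('@timestamp', 'datetime2') AS deadlock_time,
    xed.query('.') AS deadlock_graph
FROM (
    SELECT CAST(st.target_data AS xml) AS target_data
    FROM sys.dm_xe_session_targets AS st
    JOIN sys.dm_xe_sessions AS s
        ON s.address = st.event_session_address
    WHERE s.name = N'system_health'
      AND st.target_name = N'ring_buffer'
) AS src
CROSS APPLY src.target_data.nodes('RingBufferTarget/event[@name="xml_deadlock_report"]') AS x(xed);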
I think the solution above is the simplest and most understandable for future developers.
Best Answer
SQL Server uses write-ahead logging (WAL), so any committed transaction will have had its log records flushed to disk before the commit completed (unless you are using delayed durability). This means that when your database comes back online, crash recovery will replay committed transactions from the log file and roll back any transactions that were still in progress.
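(As a quick sanity check, on SQL Server 2014 and later you can confirm whether delayed durability is enabled for any database:)

-- Delayed durability setting per database (SQL Server 2014+).
SELECT name, delayed_durability_desc
FROM sys.databases;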
Despite the WAL architecture, I'm sure Windows didn't appreciate the abrupt disk loss. It certainly seems possible that data could have been corrupted - for instance, if a write to the physical log file (or to an 8 KB data page) was only halfway complete at the instant the disk was removed, that file could well be left corrupted.
If you're using the full recovery model, you could likely still do a point-in-time recovery up to just before the corruption, though there would be some data loss due to the partial write.
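For reference, a rough sketch of what that point-in-time restore could look like (all database names, file paths, and the STOPAT time are placeholders):

-- 1. If the log file is still readable, take a tail-of-the-log backup first.
BACKUP LOG MyDatabase
    TO DISK = N'C:\Backups\MyDatabase_tail.trn'
    WITH NO_TRUNCATE, NORECOVERY;

-- 2. Restore the most recent full backup without recovering.
RESTORE DATABASE MyDatabase
    FROM DISK = N'C:\Backups\MyDatabase_full.bak'
    WITH NORECOVERY, REPLACE;

-- 3. Apply log backups, stopping just before the incident.
RESTORE LOG MyDatabase
    FROM DISK = N'C:\Backups\MyDatabase_tail.trn'
    WITH STOPAT = N'2013-06-01T10:45:00', RECOVERY;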
There's really not much to be done other than running DBCC CHECKDB. This will check the database to see whether it was corrupted during this unusual event. Another option would be to restore a backup (from before this event) to a separate database or server, and then use a comparison tool (like SQL Data Compare from Red Gate) to see if any data differs between the backup and the current database.
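As a minimal example (the database name is a placeholder):

-- Check the whole database for corruption; the options keep the output
-- limited to actual errors.
DBCC CHECKDB (N'MyDatabase') WITH NO_INFOMSGS, ALL_ERRORMSGS;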