SQL Server Error Handling – Explicit Transaction Left Open After Try/Catch

Tags: error-handling, sql-server, t-sql, transaction

I recently wrote a T-SQL script to perform some updates and inserts on 3 different tables. I wanted it done in a single transaction, so I read up on the Microsoft documentation on how to use explicit transactions with try/catch.

According to the documentation, it can be done like this:

BEGIN TRANSACTION;

BEGIN TRY
    -- Generate a constraint violation error.
    DELETE FROM Production.Product
    WHERE ProductID = 980;
END TRY

BEGIN CATCH
    SELECT 
        ERROR_NUMBER() AS ErrorNumber
        ,ERROR_SEVERITY() AS ErrorSeverity
        ,ERROR_STATE() AS ErrorState
        ,ERROR_PROCEDURE() AS ErrorProcedure
        ,ERROR_LINE() AS ErrorLine
        ,ERROR_MESSAGE() AS ErrorMessage;

    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
END CATCH;

IF @@TRANCOUNT > 0
    COMMIT TRANSACTION;
GO

So I implemented this pattern and put all my updates and inserts into the TRY block. Problem is, the code managed to leave an open transaction after the query finished, and I do not see how this could happen. Regardless of what work is done inside the TRY block, what possible scenarios could cause a query like this to leave a transaction open?

Best Answer

There are a few types of errors that are not caught by the T-SQL TRY / CATCH structure (taken from the MSDN page for TRY...CATCH and slightly edited for clarity):

  • Errors that have a severity of 20 or higher that stop the SQL Server Database Engine task processing for the session. If an error occurs that has severity of 20 or higher and the database connection is not disrupted, TRY…CATCH will handle the error.
  • Attentions, such as client-interrupt requests or broken client connections.
  • When the session is ended by a system administrator by using the KILL statement.
  • Errors that occur at the same level of execution as the TRY…CATCH construct during statement-level recompilation, such as object name resolution errors that occur after compilation because of deferred name resolution.

For the example code provided in the Question (i.e. a single DML statement), there is a very simple fix: remove the explicit transaction. A single DML statement is already atomic, so an explicit transaction is unnecessary unless additional logic checks some condition and might optionally undo the change via a ROLLBACK issued in the main TRY block rather than in the CATCH (since it wasn't a database engine error). Outside of that particular scenario, I can't see a good reason to use an explicit transaction for a single DML statement. But I would still keep the TRY / CATCH structure for general error handling.
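As a sketch, the single-statement version of the question's sample would then look like this (no explicit transaction, TRY / CATCH kept for error reporting):

```sql
BEGIN TRY
    -- A single DML statement is atomic on its own;
    -- no BEGIN TRANSACTION / COMMIT needed.
    DELETE FROM Production.Product
    WHERE ProductID = 980;
END TRY
BEGIN CATCH
    SELECT
        ERROR_NUMBER()  AS ErrorNumber,
        ERROR_MESSAGE() AS ErrorMessage;
END CATCH;
```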

When it comes to multiple DML statements (which is the true context of the Question as stated in the wording of it above the example code) then you obviously do need an explicit transaction, so here are some thoughts on that:
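As a sketch of that multi-statement case (table and column names here are placeholders, and THROW assumes SQL Server 2012 or newer), the commit can be moved into the TRY block so that it only runs if every statement succeeded:

```sql
BEGIN TRY
    BEGIN TRANSACTION;

    -- Hypothetical statements standing in for the real updates / inserts:
    UPDATE dbo.TableA SET SomeCol = 1 WHERE ID = 42;
    INSERT INTO dbo.TableB (SomeCol) VALUES (1);
    INSERT INTO dbo.TableC (SomeCol) VALUES (2);

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    THROW;  -- re-raise so the caller still sees the error
END CATCH;
```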

  • A comment by @usr on the Question suggests skipping error handling at the database layer altogether and handling it at the app layer instead. App code certainly can manage transactions and error handling, but unless you are using an ORM like Entity Framework exclusively, the app layer isn't always the initiator of the execution: stored procedures may also be called by SQL Agent jobs and/or by support people via SSMS, and you still want transaction and error handling in those situations, so the stored procedures need to include it themselves. I discuss this more in the following answer: Are we required to handle Transaction in C# Code as well as in stored procedure
  • Uncommitted transactions are automatically rolled back when the connection ends. Note that "ends" here is from SQL Server's perspective, not the client's: the connection has truly ended only once it is no longer listed in sys.dm_exec_connections.
  • Connection pooling complicates the auto-rollback handling because, by design, it keeps the connection open even after the client "closes" it. The connection remains open so that another "connection" attempt can simply reuse it, which means the session itself is not cleaned up until a new connection is opened and a query batch is executed. Upon the first execution submitted by app code on a "new" connection that is actually being reused from the pool, an internal process labeled sp_reset_connection is called; among other things, it rolls back any uncommitted transaction in the session being reused. The problem arises when connection pooling is in use, the process is terminated (by a Command Timeout, perhaps), and no new connection is requested and no new query is submitted. In that case the connection just sits in the pool, the session still exists, and no cleanup operation is requested. This doesn't last forever (though still longer than you want in these cases). According to the MSDN page for SQL Server Connection Pooling (ADO.NET) (in the Removing Connections section):

    The connection pooler removes a connection from the pool after it has been idle for approximately 4-8 minutes, or if the pooler detects that the connection with the server has been severed.

  • Cancelling a stored procedure execution isn't just a matter of your app code hitting a CommandTimeout or calling an explicit "cancel". SQL Server Management Studio (SSMS), or any IDE, is also client code, and it is possible to cancel a running stored procedure in SSMS. Doing so has the same effect of skipping the CATCH block. But cancelling an execution in SSMS keeps you in the same session, so the transaction remains active until you manually call COMMIT or ROLLBACK, or until you close that query tab (at which point SSMS tells you there is an uncommitted transaction and asks whether you want to commit it; my testing shows that answering "no" rolls it back).
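To see whether any session has been left with an open transaction (for example, after a cancel in SSMS), something like the following DMV query can help. This is a sketch; the column selection is illustrative, and open_transaction_count assumes SQL Server 2012 or newer:

```sql
SELECT  s.session_id,
        s.host_name,
        s.program_name,
        s.open_transaction_count,
        t.transaction_id
FROM    sys.dm_tran_session_transactions AS t
JOIN    sys.dm_exec_sessions AS s
    ON  s.session_id = t.session_id;
```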

So what to do about app code that is using connection pooling? I am not a huge fan of using SET XACT_ABORT ON;. What I would try first is:

  • Connections can set the maximum size of the connection pool via the Max Pool Size connection string keyword. If the pool is very large, it is less likely that any particular connection will be reused quickly. The default pool size appears to be 100. Setting the value slightly lower than it currently is (but not so low that connection pooling loses its benefit) will help ensure that connections are reused more quickly, which means the cleanup process will be called more quickly, which will roll back any open transactions.
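For example, a connection string with a reduced pool size might look like the following (server, database, and the value 50 are all illustrative):

```
Server=myServer;Database=myDb;Integrated Security=true;Max Pool Size=50;
```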

  • In your app code, in the catch block around executing the query / stored procedure, you can call SqlConnection.ClearPool, which should close the actual connection at the SQL Server level and thereby allow the auto-rollback of uncommitted transactions to take place. You might even be fairly targeted with this: check the exception to see whether it is a Command Timeout or one of the other few scenarios that terminate the process while skipping the T-SQL CATCH block, and if so, call ClearPool.

  • Technically you could disable connection pooling entirely by specifying Pooling=false; in the connection string. However, I do not believe this is necessary, and I do not recommend it unless it is absolutely necessary or the application doesn't generate many connections in the first place.