SQL Server Error Handling – Explicit Transaction Left Open After Try/Catch

Tags: error-handling, sql-server, t-sql, transaction

I recently wrote a T-SQL script to perform some updates and inserts on 3 different tables. I wanted it done in a single transaction, so I read up on the Microsoft documentation on how to use explicit transactions with try/catch.

According to the documentation, it can be done like this:

BEGIN TRANSACTION;

BEGIN TRY
    -- Generate a constraint violation error.
    DELETE FROM Production.Product
    WHERE ProductID = 980;
END TRY

BEGIN CATCH
    SELECT 
        ERROR_NUMBER() AS ErrorNumber
        ,ERROR_SEVERITY() AS ErrorSeverity
        ,ERROR_STATE() AS ErrorState
        ,ERROR_PROCEDURE() AS ErrorProcedure
        ,ERROR_LINE() AS ErrorLine
        ,ERROR_MESSAGE() AS ErrorMessage;

    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
END CATCH;

IF @@TRANCOUNT > 0
    COMMIT TRANSACTION;
GO

So I implemented this pattern and put all my updates and inserts into the TRY block. Problem is, the code managed to leave an open transaction after the query finished, and I do not see how this could happen. Regardless of what work is done inside the TRY block, what possible scenarios could cause a query like this to leave a transaction open?

Best Answer

There are a few types of errors that are not caught by the T-SQL TRY / CATCH structure (taken from the MSDN page for TRY...CATCH and slightly edited for clarity):

  • Errors that have a severity of 20 or higher that stop the SQL Server Database Engine task processing for the session. If an error occurs that has severity of 20 or higher and the database connection is not disrupted, TRY…CATCH will handle the error.
  • Attentions, such as client-interrupt requests or broken client connections.
  • When the session is ended by a system administrator by using the KILL statement.
  • Errors that occur at the same level of execution as the TRY…CATCH construct during statement-level recompilation, such as object name resolution errors that occur after compilation because of deferred name resolution.

For the example code provided in the Question (i.e. a single DML statement), there is a very simple fix: remove the explicit transaction. A single DML statement is already atomic, so an explicit transaction is unnecessary unless additional logic checks some condition and might optionally undo the change via a ROLLBACK issued in the main TRY block rather than in the CATCH (since it wasn't a database engine error). Outside of that particular scenario, I can't see a good reason to use an explicit transaction for a single DML statement. But I would still keep the TRY / CATCH structure for general error handling.
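As a sketch, the single-statement version of the question's sample would then look like this (no explicit transaction, TRY / CATCH kept for error reporting):

```sql
BEGIN TRY
    -- A single DML statement is atomic on its own;
    -- no BEGIN TRANSACTION / COMMIT needed.
    DELETE FROM Production.Product
    WHERE ProductID = 980;
END TRY
BEGIN CATCH
    SELECT
        ERROR_NUMBER()  AS ErrorNumber,
        ERROR_MESSAGE() AS ErrorMessage;
END CATCH;
```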

When it comes to multiple DML statements (which is the true context of the Question as stated in the wording of it above the example code) then you obviously do need an explicit transaction, so here are some thoughts on that:
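As a sketch of that multi-statement case (table and column names here are placeholders, and THROW assumes SQL Server 2012 or newer), the commit can be moved into the TRY block so that it only runs if every statement succeeded:

```sql
BEGIN TRY
    BEGIN TRANSACTION;

    -- Hypothetical statements standing in for the real updates / inserts:
    UPDATE dbo.TableA SET SomeCol = 1 WHERE ID = 42;
    INSERT INTO dbo.TableB (SomeCol) VALUES (1);
    INSERT INTO dbo.TableC (SomeCol) VALUES (2);

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    THROW;  -- re-raise so the caller still sees the error
END CATCH;
```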

  • A comment by @usr on the Question suggests skipping error handling at the database layer altogether and handling it at the app layer instead. App code certainly can manage transactions and error handling, but unless you are using an ORM like Entity Framework exclusively, the app layer isn't always the initiator of the execution: stored procedures may also be called by SQL Agent jobs and/or by support people via SSMS, and you still want transaction and error handling in those situations, so the stored procedures need to include it themselves. I discuss this more in the following answer: Are we required to handle Transaction in C# Code as well as in stored procedure
  • Uncommitted transactions are automatically rolled back when the connection ends. Note that "ends" here is from SQL Server's perspective, not the client's: the connection has truly ended only once it is no longer listed in sys.dm_exec_connections.
  • Connection pooling complicates the auto-rollback handling because, by design, it keeps the connection open even after the client "closes" it. The connection remains open so that another "connection" attempt can simply reuse it, which means the session itself is not cleaned up until a new connection is opened and a query batch is executed. Upon the first execution submitted by app code on a "new" connection that is actually being reused from the pool, an internal process labeled sp_reset_connection is called; among other things, it rolls back any uncommitted transaction in the session being reused. The problem arises when connection pooling is in use, the process is terminated (by a Command Timeout, perhaps), and no new connection is requested and no new query is submitted. In that case the connection just sits in the pool, the session still exists, and no cleanup operation is requested. This doesn't last forever (though still longer than you want in these cases). According to the MSDN page for SQL Server Connection Pooling (ADO.NET) (in the Removing Connections section):

    The connection pooler removes a connection from the pool after it has been idle for approximately 4-8 minutes, or if the pooler detects that the connection with the server has been severed.

  • Cancelling a stored procedure execution isn't just a matter of your app code hitting a CommandTimeout or calling an explicit "cancel". SQL Server Management Studio (SSMS), or any IDE, is also client code, and it is possible to cancel a running stored procedure in SSMS. Doing so has the same effect of skipping the CATCH block. But cancelling an execution in SSMS keeps you in the same session, so the transaction remains active until you manually call COMMIT or ROLLBACK, or until you close that query tab (at which point SSMS tells you there is an uncommitted transaction and asks whether you want to commit it; my testing shows that answering "no" rolls it back).
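To see whether any session has been left with an open transaction (for example, after a cancel in SSMS), something like the following DMV query can help. This is a sketch; the column selection is illustrative, and open_transaction_count assumes SQL Server 2012 or newer:

```sql
SELECT  s.session_id,
        s.host_name,
        s.program_name,
        s.open_transaction_count,
        t.transaction_id
FROM    sys.dm_tran_session_transactions AS t
JOIN    sys.dm_exec_sessions AS s
    ON  s.session_id = t.session_id;
```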

So what to do about app code that is using connection pooling? I am not a huge fan of using SET XACT_ABORT ON;. What I would try first is:

  • Connections can set the maximum size of the connection pool via the Max Pool Size connection string keyword. If the pool is very large, it is less likely that any particular connection will be reused quickly. The default pool size appears to be 100. Setting the value slightly lower than it currently is (but not so low that connection pooling loses its benefit) will help ensure that connections are reused more quickly, which means the cleanup process will be called more quickly, which will roll back any open transactions.
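For example, a connection string with a reduced pool size might look like the following (server, database, and the value 50 are all illustrative):

```
Server=myServer;Database=myDb;Integrated Security=true;Max Pool Size=50;
```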

  • In your app code, in the catch block around executing the query / stored procedure, you can call SqlConnection.ClearPool, which should close the actual connection at the SQL Server level and thereby allow the auto-rollback of uncommitted transactions to take place. You might even be fairly targeted with this: check the exception to see whether it is a Command Timeout or one of the other few scenarios that terminate the process while skipping the T-SQL CATCH block, and if so, call ClearPool.

  • Technically you could disable connection pooling entirely by specifying Pooling=false; in the connection string. However, I do not believe this is necessary, and I do not recommend it unless it is absolutely necessary or the application doesn't generate many connections in the first place.