Sql-server – nyway to make this stored procedure thread safe

sql serverstored-procedures

I have a stored procedure that deletes X rows. This procedure is called by a Job Scheduler (Not SQL Server Agent) from one of three Azure Instances. The deletion task is run every 30 minutes from each instance and almost always results in a deadlock. Basically, multiple/simultaneous calls of this same stored procedure keep blocking eachother out.

Any suggestions on how to make this stored procedure threadsafe ?

CREATE PROCEDURE [dbo].[RawData_d_DeleteObsoleteRawData]
         @RemoteReadingMeasureIdList [IntListType] READONLY,
         @MaxSavingDurationInDays INT
AS
BEGIN

         DECLARE @CurrentDate SMALLDATETIME = CAST(GETUTCDATE() as SMALLDATETIME)
         DECLARE @MaxTime SMALLDATETIME
         DECLARE @TotalDeleted INT = 0

         -- Max time for raw data
         SET @MaxTime = DATEADD(DAY, -@MaxSavingDurationInDays, @CurrentDate)

         -- Delete non remote reading raw data
         DELETE TOP (10000) FROM [dbo].[RawData]
         WHERE MeasureId NOT IN (SELECT Item FROM @RemoteReadingMeasureIdList)
         AND DateEnergy < @MaxTime

         SET @TotalDeleted = @TotalDeleted + @@ROWCOUNT

         -- Delete non remote reading raw data (keep 1 raw data per day for remote reading)
         ;WITH cte AS
         (
                 SELECT AreaId, MeasureId, DateEnergy,
                          ROW_NUMBER() OVER (PARTITION BY MeasureId, YEAR(DateEnergy), DAY(DateEnergy) ORDER BY DateEnergy DESC) AS rn
                 FROM [RawData] rd WITH (NOLOCK)
                 INNER JOIN @RemoteReadingMeasureIdList m ON rd.MeasureId = m.Item
                 WHERE DateEnergy < @MaxTime
         )
         DELETE TOP (10000) rd
         FROM [RawData] rd
         INNER JOIN cte ON rd.AreaId = cte.AreaId AND rd.DateEnergy = cte.DateEnergy AND rd.MeasureId = cte.MeasureId
         WHERE rn <> 1

         SET @TotalDeleted = @TotalDeleted + @@ROWCOUNT

         SELECT @TotalDeleted as DeletedRow

Update 26/03/2014

Using the READPAST solution recommended by Jon was the best overall option.

To test I created two new queries, each with their own spid

Spid 51

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
Begin transaction
Exec [RawData_d_DeleteObsoleteRawData]

Spid 52

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
Exec [RawData_d_DeleteObsoleteRawData]

The first SPID does not commit and the second one is run immediately after.

Without the READPAST hint, all other executions of RawData_d_DeleteObsoleteRawData never complete.

With the READPAST hint added to the two DELETEs and the SELECT, all other executions of RawData_d_DeleteObsoleteRawData are able to continue.

Thank you again to wBob and Jon Seigel for their assistance

Best Answer

I know raising the transaction isolation level and adding a transaction can seem counter-intuitive, but sometimes, you have to make things queue up and wait their turn. Doing those two things worked for me in my simple test rig. Try it and let us know how you get on.

EDIT: Test rig OK, nice simple test rig using tempdb. It's a bit rough and ready but provided an easy repro and provable fix for me. You should enable SQLCMD mode in SSMS before you run. I basically pit the proc against itself in multiple sessions using either ostress or GO 100. Run the initial version of the proc, monitor for deadlocks; I get many. Then, reset the rig, run in the revised version of the proc and try again.

USE tempdb
GO

------------------------------------------------------------------------------------------------
-- Setup START
------------------------------------------------------------------------------------------------

SET NOCOUNT ON
GO

IF OBJECT_ID('[dbo].[RawData_d_DeleteObsoleteRawData]') IS NOT NULL
DROP PROC [dbo].[RawData_d_DeleteObsoleteRawData]
GO

-- Example table type
IF EXISTS ( SELECT * FROM sys.types st JOIN sys.schemas ss ON st.schema_id = ss.schema_id WHERE st.name = N'IntListType' )
    DROP TYPE IntListType
GO

CREATE TYPE IntListType AS TABLE ( Item INT PRIMARY KEY ) 
GO
-- Create a large table
-- DROP TABLE dbo.rawData
IF OBJECT_ID('dbo.rawData') IS NULL
CREATE TABLE dbo.rawData ( 
    rowId INT IDENTITY PRIMARY KEY,
    AreaId INT, 
    MeasureId INT, 
    someData UNIQUEIDENTIFIER DEFAULT NEWID(), 
    DateEnergy DATETIME DEFAULT GETDATE(), 
    addedBy VARCHAR(30) DEFAULT SUSER_NAME(), 
    ts ROWVERSION 
    )
GO


TRUNCATE TABLE dbo.rawData

-- Populate table
;WITH cte AS (
SELECT TOP 1000 ROW_NUMBER() OVER ( ORDER BY ( SELECT 1 ) ) rn
FROM master.sys.columns c1
    CROSS JOIN master.sys.columns c2
    CROSS JOIN master.sys.columns c3
)
INSERT INTO dbo.rawData ( AreaId, MeasureId, DateEnergy )
SELECT areaId.rn, MeasureId.rn, DATEADD( day, DateEnergy.rn % 300, '1 Jan 2014' )
FROM 
    ( SELECT TOP 10 rn FROM cte  ) areaId,
    ( SELECT TOP 100 rn FROM cte  ) MeasureId,
    ( SELECT TOP 1000 rn FROM cte  ) DateEnergy,
    ( SELECT TOP 2 rn FROM cte  ) doubleUp
GO



-- Original proc
CREATE PROCEDURE [dbo].[RawData_d_DeleteObsoleteRawData]
         @RemoteReadingMeasureIdList [IntListType] READONLY,
         @MaxSavingDurationInDays INT
AS
BEGIN

         DECLARE @CurrentDate SMALLDATETIME = CAST(GETUTCDATE() as SMALLDATETIME)
         DECLARE @MaxTime SMALLDATETIME
         DECLARE @TotalDeleted INT = 0

         -- Max time for raw data
         SET @MaxTime = DATEADD(DAY, -@MaxSavingDurationInDays, @CurrentDate)

         -- Delete non remote reading raw data
         DELETE TOP (10000) FROM [dbo].[RawData]
         WHERE MeasureId NOT IN (SELECT Item FROM @RemoteReadingMeasureIdList)
         AND DateEnergy < @MaxTime

         SET @TotalDeleted = @TotalDeleted + @@ROWCOUNT

         -- Delete non remote reading raw data (keep 1 raw data per day for remote reading)
         ;WITH cte AS
         (
                 SELECT AreaId, MeasureId, DateEnergy,
                          ROW_NUMBER() OVER (PARTITION BY MeasureId, YEAR(DateEnergy), DAY(DateEnergy) ORDER BY DateEnergy DESC) AS rn
                 FROM [RawData] rd WITH (NOLOCK)
                 INNER JOIN @RemoteReadingMeasureIdList m ON rd.MeasureId = m.Item
                 WHERE DateEnergy < @MaxTime
         )
         DELETE TOP (10000) rd
         FROM [RawData] rd
         INNER JOIN cte ON rd.AreaId = cte.AreaId AND rd.DateEnergy = cte.DateEnergy AND rd.MeasureId = cte.MeasureId
         WHERE rn <> 1

         SET @TotalDeleted = @TotalDeleted + @@ROWCOUNT

         SELECT @TotalDeleted as DeletedRow

    END
GO
:exit

-- Setup END
------------------------------------------------------------------------------------------------


------------------------------------------------------------------------------------------------
-- Main proc call START
-- Run this section in two or more connections
-- eg using ostress
-- !!ostress.exe -E -S.\sql2008r2 -dtempdb -i"C:\temp\temp.sql" -n10 -r5 -mstress
------------------------------------------------------------------------------------------------

-- Proc call
SET NOCOUNT ON
DECLARE @tvp AS [IntListType]

INSERT INTO @tvp
VALUES ( 1 ), ( 2 ), ( 3 ), ( 5 ), ( 8 ), ( 13 )

EXEC [dbo].[RawData_d_DeleteObsoleteRawData] @tvp, 30
GO 100

-- Main proc call END
------------------------------------------------------------------------------------------------



------------------------------------------------------------------------------------------------
-- Revised proc START
------------------------------------------------------------------------------------------------

/*  Reset the rig and try with the revised proc 

ALTER PROCEDURE [dbo].[RawData_d_DeleteObsoleteRawData]
         @RemoteReadingMeasureIdList [IntListType] READONLY,
         @MaxSavingDurationInDays INT
AS
BEGIN

         DECLARE @CurrentDate SMALLDATETIME = CAST(GETUTCDATE() as SMALLDATETIME)
         DECLARE @MaxTime SMALLDATETIME
         DECLARE @TotalDeleted INT = 0

         -- Max time for raw data
         SET @MaxTime = DATEADD(DAY, -@MaxSavingDurationInDays, @CurrentDate)

    SET TRANSACTION ISOLATION LEVEL REPEATABLE READ

    BEGIN TRY

        BEGIN TRAN

            -- Delete non remote reading raw data
            DELETE TOP (10000) FROM [dbo].[RawData]
            WHERE MeasureId NOT IN ( SELECT Item FROM @RemoteReadingMeasureIdList )
            AND DateEnergy < @MaxTime

            SET @TotalDeleted = @TotalDeleted + @@ROWCOUNT

            -- Delete non remote reading raw data (keep 1 raw data per day for remote reading)
            ;WITH cte AS
            (
                    SELECT AreaId, MeasureId, DateEnergy,
                            ROW_NUMBER() OVER (PARTITION BY MeasureId, YEAR(DateEnergy), DAY(DateEnergy) ORDER BY DateEnergy DESC) AS rn
                    FROM [RawData] rd
                    INNER JOIN @RemoteReadingMeasureIdList m ON rd.MeasureId = m.Item
                    WHERE DateEnergy < @MaxTime
            )
            DELETE TOP (10000) rd
            FROM [RawData] rd
            INNER JOIN cte ON rd.AreaId = cte.AreaId AND rd.DateEnergy = cte.DateEnergy AND rd.MeasureId = cte.MeasureId
             WHERE rn <> 1

             SET @TotalDeleted = @TotalDeleted + @@ROWCOUNT

             SELECT @TotalDeleted as DeletedRow

             COMMIT TRAN

         END TRY

         BEGIN CATCH

            DECLARE @errorMessage NVARCHAR(2048) = ERROR_MESSAGE()

            IF @@TRANCOUNT > 0 BEGIN PRINT 'Error handler rolling back...' ROLLBACK TRAN END

            RAISERROR( @errorMessage, 16, 1 )

         END CATCH

    END
GO
*/


-- Revised proc END
------------------------------------------------------------------------------------------------