Sql-server – Work queue with complex select and long processing: Ensuring concurrency

concurrencylockingqueuesql serversql-server-2012

I have a queue table from which worker processes claim records, one at a time. It is crucial that no two workers claim same item to process.

Normally this kind of thing is done with select top (1) ... with (updlock, rowlock, readpast).
This works as expected and provides great concurrency when there is a covering index, but for this particular query the where condition is complex, depends on the time of the day and on presence of records in other tables (where exists(...)), and is followed by an order by of similar nastiness, so it cannot be covered by a single index or an indexed view.

In this situation the select top (1) puts the updlock on all rows that match the where, not just the first one, which leads to one worker winning the job and all the others waiting (or falsely reporting that there were no more rows to process).

While this does not lead to any errors, the concurrency is not great. The problem is that record selection is quite cheap, but record processing is very expensive. So the other workers sit there doing nothing for seconds.

In an attempt to fix this concurrency issue I came up with the following solution for the job selection (error handling removed for clarity):

exec sp_getapplock N'Pick a Job', 'Update';

declare @will_lock_that int;
select top (1) @will_lock_that = row_id from the_table with(readpast) where... order by...;

declare @locked int;
select @locked = row_id from the_table with(updlock, rowlock, holdlock) where row_id = @will_lock_that;

exec sp_releaseapplock N'Pick a Job';

The idea was that:

  • The applock makes sure only one process at a time will be picking a job. This is okay because picking a job is cheap.
    The applock will be released very quickly, letting the other workers pick their jobs, while the original worker will be processing the item, having released the applock.
  • The first select statement will have to fetch all rows matching the where condition, as described above, but this time it will not put any persistent locks on them. The shared locks will be released as soon as the select returns.
  • The readpast in that select will skip over any rows already being processed, because they are locked with updlocks.
  • The locking hints in the second select will hide the row from the first, readpast-enabled select performed by others.

Of course this did not work. readpast did not try to skip over the updlocked rows. Instead it would wait for the first encountered locked row to be released. Changing the updlock to xlock in the second query does not change this, neither does making an actual update to the row.
The only way to make the readpast skip over rows seems to be adding a locking hint to the readpast query itself (e.g. with(readpast, updlock)) – but that fails the original purpose, because now all rows will again be locked with updlock.

Is there a way to solve this concurrency problem without using manual locks? (By manual locks I mean updating a special field in the row to 'Locked', outside of a transaction for others to see, and then updating it back when processing is finished.)

I cannot use the locking by deletion, too, because the row must stay in the same table, and only few of its fields will change after processing.

Best Answer

You mentioned you could not use 'locking by deletion' because the row must stay in the table. The code example below does use it, but only after creating a global queue first and populating it. It then uses the 'locking by deletion' for the ##Queue table rather than the table you're working with.

USE master
GO

    -- 1. Create a Stored Procedure that creates the Queue and populates it based on the criteria you have.
    --      (this doesn't have to be a Stored Procedure, but can also be run via some other method to populate the queue.)
    --      (also update 'DatabaseName.SchemaName.the_table' to be the name of your table.)

    IF EXISTS (SELECT * FROM master.dbo.sysobjects o WHERE o.xtype IN ('P') AND o.id = object_id('master.dbo.CreateQueue'))
    DROP PROC CreateQueue
    GO

    CREATE PROCEDURE dbo.CreateQueue
    AS
    BEGIN
        IF OBJECT_ID('Tempdb.dbo.##Queue') IS NOT NULL
        DROP TABLE ##Queue

        SELECT * INTO ##Queue FROM DatabaseName.SchemaName.the_table WHERE <replace with your criteria> 
    END
    GO

    -- 2. Execute the SP above or use the code some other way to create the ##Queue

    -- 3. Copy the code below and run it via multiple processes if you like, as there should not be concurrency issues.
    WHILE 1 = 1
        BEGIN
            DELETE TOP ( 1 )
                    ##Queue WITH ( READPAST )
            OUTPUT  Deleted.*
                    INTO #RowToProcess
            IF @@ROWCOUNT > 0 
                BEGIN
                    --Place logic here to work with the row...
                    DELETE  FROM #RowToProcess
                END
            ELSE
                BREAK
        END