Sql-server – Database table as a queue, order by another table’s column

concurrencyindex-tuningqueuesql serversql-server-2008

I have two tables. One table serves as the queue from which multiple users simultaneously dequeue items one at a time, by flagging the item's row. The other table is joined and used for ordering. It is required that no two users will get the same item from the queue and the queue will not block users, so if user a is busy getting the top item, user b should be able to get the next available item.

The problem is, and i can reproduce, simultaneous dequeue will result in exactly one user to succeed; the others are left empty handed. This happens even when there are multiple candidate items in the queue, so the queue is not empty.

If i remove the order by clause, everything works fine, users can simultaneously dequeue, without blocking each other.

How can i solve the problem, i need the ordering and i need the queue to be non-blocking.

Following are the two tables:

CREATE TABLE DBO.QUEUE 
( 
  QUEUEID    INT IDENTITY(1, 1) NOT NULL PRIMARY KEY, 
  SOMEACTION VARCHAR(100)
);
GO 

DECLARE  @counter INT 
SELECT   @counter = 1 
WHILE (@counter <= 10) 
BEGIN 
    INSERT INTO DBO.QUEUE (SOMEACTION) 
    SELECT 'some action ' + CAST(@counter AS VARCHAR);
    SELECT @counter = @counter + 1 
END

CREATE TABLE DBO.QUEUE_ORDERING 
( 
    QUEUEID    INT IDENTITY(1, 1) NOT NULL PRIMARY KEY, 
    ORDERING_COLUMN INT
);
GO 

DECLARE  @counter INT 
SELECT   @counter = 1 
WHILE (@counter <= 10) 
BEGIN 
    INSERT INTO DBO.QUEUE_ORDERING (ORDERING_COLUMN) 
    SELECT @counter;
    SELECT @counter = @counter + 1 
END

And the testing script is as follows , running them from multiple sessions will reproduce my problem:

BEGIN TRAN TRAN1
DECLARE @QUEUEID INT;
;WITH QUEUE_CTE AS 
(
    SELECT   TOP(1) Q.QUEUEID, Q.SOMEACTION
    FROM     DBO.QUEUE Q WITH (ROWLOCK,READPAST, UPDLOCK)
    JOIN     DBO.QUEUE_ORDERING QR WITH (NOLOCK) 
    ON       QR.QUEUEID = Q.QUEUEID
    ORDER BY QR.ORDERING_COLUMN DESC
)

SELECT @QUEUEID = QUEUEID FROM QUEUE_CTE
PRINT 'processing item # ' + CAST(@QUEUEID AS VARCHAR) 

WAITFOR DELAY '00:00:05' 
COMMIT

Edit:

I've traced down the issue to indices. If i order by the column that is part of the clustered index, then only the relevant row is locked. Otherwise, all rows are locked. I still don't have any idea how to resolve the issue. I need the ordering to be on a column that can not be part of the clustered index. The first image shows the problematic one, you can see that all the rows are locked. In the second one, below, only one row gets locked; order by is on the clustered index column.

This is the problematic version, where all the rows are locked

This is the working one, ordering on clustered index column, only one row gets locked

Best Answer

It seems that preventing the locking of candidate rows is impossible with an ordering on a column not part of the clustered index.

I came up with a workaround, that seems to solve the problem on the minimal example i provided. I'll do some tests to make sure if it solves the actual production problem.

The idea is to split dequeue operation into two steps. First step respects the ordering and selects a small candidate set; second step ignores any orderings and works on this small set. Thus preventing locking of all the rows.

  • Select more than one rows to be dequeued, without removing the original order by clause. In my case, i changed the CTE to select top(50) rows. And also changed (ROWLOCK, READPAST, UPDLOCK) to read (NOLOCK) as i don't intend to dequeue all these rows, these are all just candidates for the one row i am willing to dqueue. These rows are stored in a table variable, to be joined in the second step, which follows below.

  • Create a new (second) CTE, this time joining with the table variable (which contains at most 50 rows), where clauses will be repeated here to prevent an unwanted row appearing in the result set. The ordering this time is removed from the new CTE to prevent the locking.

Although not critical for my specific needs, there are a couple of issues with this approach:

  • If more than 50 users simultaneously try to dequeue, only the lucky 50 of them will succeed, while others will not be able to get an item.
  • Granularity of the ordering is lost, each item in the candidate list of 50 are treated equally.
  • Again on the ordering, it might be the case that a user dequeues an item which was not in the top 50, but this won't be a totally off target miss; it will be close to the top 50. This is due to the two step approach: things might change and new items might arrive that have higher priority, from the time first CTE runs until the second CTE which performs the actual update. Anyways, it is highly likely that the item dequeued is within the first 50 items.

Test script for the new approach is as follows:

BEGIN TRAN TRAN1
DECLARE @QUEUEID INT;

DECLARE @CANDIDATE_TABLE TABLE (
    QUEUEID INT NOT NULL
);

;WITH CANDIDATE_CTE AS 
(
    SELECT   TOP(50) Q.QUEUEID
    FROM     DBO.QUEUE Q WITH (NOLOCK)
    JOIN     DBO.QUEUE_ORDERING QR WITH (NOLOCK) 
    ON       QR.QUEUEID = Q.QUEUEID
    ORDER BY QR.ORDERING_COLUMN DESC
)
INSERT INTO @CANDIDATE_TABLE (QUEUEID)
SELECT QUEUEID FROM CANDIDATE_CTE;

;WITH QUEUE_CTE AS
(
    SELECT   TOP(1) Q.QUEUEID
    FROM     DBO.QUEUE Q WITH (ROWLOCK, READPAST, UPDLOCK)
    JOIN     @CANDIDATE_TABLE CT
    ON       CT.QUEUEID = Q.QUEUEID
)
SELECT @QUEUEID = QUEUEID FROM QUEUE_CTE
PRINT 'processing item # ' + CAST(@QUEUEID AS VARCHAR) 

WAITFOR DELAY '00:00:05' 
COMMIT

The outputs of two simultaneous sessions are:

(10 row(s) affected)

processing item # 10


(10 row(s) affected)

processing item # 9