SQL Server – How to Exclusively Access a Table and Prevent Other Queries

Tags: isolation-level, sql-server, transaction

I'm stuck on setting the transaction isolation level. Here's the scenario that happens in the application:

  1. Get unprocessed messages (using the IsProcessing flag)
  2. Set their IsProcessing to true (in RAM) and update their IsProcessing status
  3. Do the business
  4. Set their IsProcessing to false (in RAM) and update their IsProcessing status

This works fine when run sequentially. But when I run multiple instances of my application (concurrency), I see that some messages are processed twice or thrice. Here's what happens:

  1. Instance A gets some unprocessed messages
  2. While instance A is setting IsProcessing to true in RAM, instance B gets some messages, and chances are that it fetches one or more of the messages already fetched by instance A

And this is what I've done in hope of preventing it:

  1. Begin transaction (serializable)
  2. Get unprocessed messages (using the IsProcessing flag)
  3. Set their IsProcessing to true (in RAM) and update their IsProcessing status
  4. Commit transaction
  5. Do the business
  6. Set their IsProcessing to false (in RAM)
  7. Begin transaction (serializable)
  8. Update their IsProcessing status
  9. Commit transaction

I don't know why, but during steps 1 to 4, other instances can still perform read queries. This is not desired. I want to prevent anything, even read queries, from being executed against the messages table during steps 1 to 4.

How can I do that? What am I missing in my design? The goal is to make sure that while a message is queued for processing, no other instance will process it again.

Best Answer

You will be able to avoid a lot of your race conditions by performing many of your steps in a single statement. By using a TOP() clause you can set the flag on at most one row. By using the OUTPUT clause you can return that row to the application atomically.

I define a simple test table and populate it:

create table dbo.T(id int, IsProcessing bit default 0);

insert dbo.T(id) values (1),(2),(3),(4);

The OUTPUT clause needs a table variable to receive the changed values:

declare @U table (id int);

A bit of debug code to make the "before" and "after" states obvious:

select 'Before' as B, * from dbo.T;

And the statement itself:

update top (1) dbo.T
set IsProcessing = 1
OUTPUT INSERTED.id
INTO @U(id)
where IsProcessing = 0;

And the results:

select 'After' as A, * from dbo.T;
select * from @U;

This is the output from the second execution of the above.

B       id  IsProcessing
Before  1   1
Before  2   0
Before  3   0
Before  4   0

A       id  IsProcessing
After   1   1
After   2   1
After   3   0
After   4   0

id 
2

The row with id=2 has gone from IsProcessing=0 to IsProcessing=1, and that id is returned in the table variable.

With this trivial data the rows are processed in the sequence they were created. In a more complex environment the optimizer may choose any row that matches the WHERE clause. If you have a requirement to process rows in, say, time sequence, further qualification will be required, as sketched below.
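
For example, if the table had a timestamp column (a hypothetical CreatedAt here, not part of the table above), one way to always claim the oldest unprocessed row is to update through a CTE that uses TOP (1) ... ORDER BY. This is only a sketch under that assumption:

;with NextMessage as (
    select top (1) id, IsProcessing
    from dbo.T
    where IsProcessing = 0
    order by CreatedAt          -- assumed queueing column
)
update NextMessage
set IsProcessing = 1
OUTPUT INSERTED.id
INTO @U(id);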

I haven't thought it all the way through, but I believe this will work at any isolation level, with or without explicit transactions.

It is of course possible that there are no rows with IsProcessing=0 at the time the statement runs. In this case the table variable will have zero rows, and the application should check for that before doing the business.
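
A minimal check for that case, using the table variable from above, might look like this:

if not exists (select 1 from @U)
begin
    -- nothing was claimed; skip the business logic and try again later
    print 'No unprocessed messages';
end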

To fully isolate each transaction from the others you could try sp_getapplock. This will add overhead and reduce concurrency. You have to be careful to release the app lock as soon as you can in both success and failure scenarios.
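
A rough sketch of that approach, assuming a made-up resource name 'MessageQueue' and a 5-second timeout. With a transaction-owned lock, committing or rolling back the transaction is what releases it:

declare @result int;

begin tran;

exec @result = sp_getapplock
    @Resource    = 'MessageQueue',   -- arbitrary name shared by all instances
    @LockMode    = 'Exclusive',
    @LockOwner   = 'Transaction',
    @LockTimeout = 5000;             -- milliseconds to wait for the lock

if @result >= 0
begin
    -- lock acquired: claim the rows here (the UPDATE ... OUTPUT from above)
    commit tran;    -- commit releases the app lock
end
else
begin
    rollback tran;  -- lock not acquired within the timeout; give up or retry
end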