I am not even sure this question is necessary but I am curious to know everyone's thoughts. I have two databases on the same server, dbFoo, dbBar. dbFoo has the following table please note this is a way dumbed down example, and the syntax may not be correct as I am rushing and way more interested in the answer to the underlying issue then the code to do it…
CREATE TABLE dbo.CodeNumbers(
CodeNumbersID INT IDENTITY (1,1) NOT NULL PRIMARY KEY,
CodeValue VARCHAR(30) NOT NULL
IsUsed BIT NOT NULL DEFAULT(0)
);
dbo.CodeNumbers
gets populated with a monthly CSV provided, the import method of your choice is already written to get them in there. We are NEVER given a duplicate code.
Lets assume for arguments sake we have 10,000,000 rows in the table. that all follow this format when imported:
1, 'ajdirjfisofklrlfo039402', 0 all the way till
10000000, 'fkeiir9489', 0
Now in dbBar I have 2 stored procedures the first should access the first un-used code in dbFoo, return it in an out variable and mark it as used. So I have something like:
CREATE PROCEDURE GetNextUseableCode
@CodeOut VARCHAR(30) OUTPUT,
@CID INT OUTPUT
AS
SELECT @CID = CodeNumbersID, @CodeOut = CodeValue
FROM dbFoo.dbo.CodeNumbers
WHERE IsUsed = 0
UPDATE dbFoo.dbo.CodeNumbers
SET IsUsed = 1
WHERE CodeNumbersID = @CID
The code that calls the procedure from dbBar is accessed by 200k sessions a day at various times. When dbFoo.Codes
has no more to return, its fine all is well and good, the application is simply told sorry no more check back tomorrow.
I have 3 main questions..
-
Is there anything special I would need to have in the code to avoid race conditions, and if so what would be best to handle for this without bringing the system to its knees.
-
Is their an efficient way to ensure the next code grabbed whenever the procedure is called, is the next one in chronological order within the ID column.
-
Is there any other concerns that I am not looking at right now that could issue big problems, and what would be an eloquent way to handle this situation?
I understand this is a long pretty open ended question, I have some coded up solutions, but I feel like there are much better ways to get the results I want.
Thanks in advance as always for all help.
Best Answer
There is nothing in your
SELECT
that dictates order at all. It is also not protected from two sessions reading the same row. To see that it is not safe:Create a throwaway table with a key column.
Run this code in a loop from two different sessions:
Check the output for these - they'll happen:
Or you might see deadlocks if you wrap the
SELECT
/UPDATE
in an explicit transaction:To get around this problem, and to ensure that the ID you get is the lowest one available, you could do this:
Note that I added an explicit transaction and also
XLOCK
/HOLDLOCK
hints to prevent two simultaneous sessions from reading the same row. Of course, this has an impact on concurrency (which, unfortunately, is exactly what you want and need here).Other ways to do this include just updating the row and then using a table variable to capture values from the
OUTPUT
clause:Per Paul's update, yes, you could also do this without the table variable:
(Though I am not all that fond of this syntax; not sure why. Might be the same reason I always forget it exists.)
You could change the caller to expect a resultset instead of two output parameters, but that is extra work too. In both cases you'd still need to ensure you got the lowest ID available, which likely means a CTE with a
SELECT
and the same hints. Some discussion here. I also talk about a similar approach in this blog post but I didn't get into anything about concurrency and two sessions trying to delete the same row at the same time. Obviously in that case only one of them can win, but with anUPDATE
they could both manage to be successful (at least in theory).To make things easier, you could relax the restriction that the "next" ID handed out is the lowest ID available. But you still need the isolation through the hints to ensure that two simultaneous sessions don't happen to read the same value (which could happen regardless of ordering). Hopefully with a suitable index these won't destroy concurrency.