SQL Server – Use Table as GUID Pool Without Race Conditions

blockingsql server

I am in a position where I will be issued with a list of GUIDs to be assigned by a SQL Server database at runtime.

My immediate idea is to create a table for the GUIDs. When a batch of GUIDs is ordered, they will be inserted into the table.

Then, at runtime, a SPROC will be called to pick a GUID from the table. I can either retrieve the next GUID and immediately DELETE it from the table, or retreive the next GUID and UPDATE another column to claim it as used.

But this causes me to wonder about the potential for race conditions where two calls are simultaneously made to the database for a GUID. It seems to me that the same GUID could be retrieved for the two calls if the timing were perfectly inconvenient.

Q1: Am I correct to be concerned about this potential race condition? i.e. can it happen?

Q2: Is there a way to avoid such a condition (e.g. set a SPROC to block on each call?)

One way I can think of which would at least make the race condition fail safe is to set up another table where GUID assignments are made, and set the GUID column to unique. Then, in the runtime process, when a GUID is retrieved it will be assigned to the product via the GUID assignment table. If two products were issued the same GUID, then when it came time to assign them one of the products would fail to assign and be rejected.

Best Answer

Basically you are looking for exclusive access to the next row in a queue table. Assuming this kind of table:

CREATE TABLE dbo.GUIDQueue
(
  RowNumber bigint IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
  TheGuid   uniqueidentifier     NOT NULL,
  Claimed   bit                  NOT NULL DEFAULT 0
);

-- insert some n number of rows
DECLARE @n int = 1000;

;WITH src(guid) AS 
(
  SELECT TOP (@n) NEWID() 
    FROM sys.all_objects AS o
    CROSS JOIN sys.all_objects AS s
)
INSERT dbo.GUIDQueue(TheGuid) SELECT guid FROM src;

This procedure will serialize deletes without blocking another session from taking the next value:

ALTER PROCEDURE dbo.PluckFromQueue
  @NextGUID uniqueidentifier OUTPUT
AS
BEGIN
  DECLARE @g TABLE(NextGUID uniqueidentifier);

  ;WITH NextGUID AS 
  (
    SELECT TOP (1) TheGuid
      FROM dbo.GUIDQueue WITH (ROWLOCK, READPAST, UPDLOCK)
      ORDER BY RowNumber
  )
  DELETE NextGUID OUTPUT deleted.TheGUID INTO @g;
  
  SELECT @NextGUID = MAX(NextGUID) FROM @g;
END
GO

Or if you want to update instead:

ALTER PROCEDURE dbo.PluckFromQueue
  @NextGUID uniqueidentifier OUTPUT
AS
BEGIN
  DECLARE @g TABLE(NextGUID uniqueidentifier);

    ;WITH NextGUID AS 
    (
      SELECT TOP (1) RowNumber, TheGuid, Claimed
        FROM dbo.GUIDQueue WITH (ROWLOCK, READPAST, UPDLOCK)
        WHERE Claimed = 0
        ORDER BY RowNumber
    )
    UPDATE NextGUID SET Claimed = 1
      OUTPUT deleted.TheGUID INTO @g;
  
  SELECT @NextGUID = MAX(NextGUID) FROM @g;
END
GO

You would grab the next item off the queue like this:

DECLARE @guid uniqueidentifier;
EXEC dbo.PluckFromQueue @NextGUID = @guid OUTPUT;
SELECT @guid;

And then use @guid for whatever subsequent processing you need to do.

To test that this is serialized, put this in two windows:

BEGIN TRANSACTION;
DECLARE @guid uniqueidentifier;
EXEC dbo.PluckFromQueue @NextGUID = @guid OUTPUT;
SELECT @guid;
-- COMMIT TRANSACTION;

Run one, then the other, both will get a value. Don't forget to commit both.

Remus and gbn talk about why you want those three locking locking hints to the dequeue query.

If you want to observe what happens without those hints (and maybe this is the behavior you want), comment them out, alter the procedure, and repeat the experiment above - you will see that the first blocks the second until the first commits. No matter how close in timing you think you can get, one will win and the other will wait. But don't exacerbate that by wrapping the procedure call in a transaction in either case. The only reason you would want that is if you wanted to "undo" the dequeue operation. If you end up not using the GUID off the queue due to some condition or exception, oh well, consider it a loss (or you could always insert that value back into the queue).

As above, don't forget to commit the second transaction, too.

I wrote about something similar here (the beauty of generating a whole bunch of unique numbers you don't have to confirm are unique later), but in that article I didn't address what to do in case you want to roll back a claim:

Generating Random Numbers in SQL Server Without Collisions

You might consider adding some kind of background process to monitor the size of the queue (or how many unclaimed rows are left, in which case a filtered index might be handy), so that you can automatically replenish with new rows once it gets to a certain point (instead of running out mid-transaction). Make sure your background process counts rows in a smart way.

Related Solutions

Sql-server – SQL Server equivalent to “OVERRIDING USER VALUE”

In SQL Server you actually shouldn't need to make any changes to the outlying code or add an INSTEAD OF trigger to make this work. Here is a quick example, tested on SQL Server 2012, but should work fine on 2005 as well:

CREATE TABLE dbo.source(bar INT, x UNIQUEIDENTIFIER PRIMARY KEY DEFAULT NEWID());
GO
CREATE TABLE dbo.[target](bar INT, x UNIQUEIDENTIFIER PRIMARY KEY);
GO

INSERT dbo.source(bar) SELECT 1;
GO
INSERT dbo.[target] SELECT * FROM dbo.source;
GO

ALTER TABLE dbo.[target] ADD TargetID INT IDENTITY(1,1);
GO

TRUNCATE TABLE dbo.source;
INSERT dbo.source(bar) SELECT 1;
GO
INSERT dbo.[target] SELECT * FROM dbo.source;
GO

SELECT * FROM dbo.[target];

Results:

bar   x                                      TargetID
---   ------------------------------------   --------
1     EFF8DAC4-FB3E-4734-80BE-6DC229846203   1
1     5036688D-C04A-45FC-920E-FF44D7D501D1   2

Now I can also change the primary key and repeat the process:

ALTER TABLE [dbo].[target] DROP CONSTRAINT PK__target__3BD019E50386B4EA;
ALTER TABLE [dbo].[target] ADD CONSTRAINT PK__target__3BD019E50386B4EA  
  PRIMARY KEY (targetID);

TRUNCATE TABLE dbo.source;
INSERT dbo.source(bar) SELECT 1;
GO
INSERT dbo.[target] SELECT * FROM dbo.source;
GO

SELECT * FROM dbo.[target];

Results:

bar   x                                      TargetID
---   ------------------------------------   --------
1     EFF8DAC4-FB3E-4734-80BE-6DC229846203   1
1     5036688D-C04A-45FC-920E-FF44D7D501D1   2
1     41FE97FF-7D45-46EB-8A0D-B2C3BA1E67EA   3

So I haven't had to change my bad code that uses insert/select without any column lists, as long as the source table doesn't also change and assuming that the only change to the target is the addition of an identity column.

PS here is how you can automate the generation of the IDENTITY columns (assuming you will want <tablename>ID):

DECLARE @sql NVARCHAR(MAX);

SET @sql = N'';

SELECT @sql = @sql + '
  ALTER TABLE ' + QUOTENAME(OBJECT_SCHEMA_NAME(t.[object_id]))
  + '.' + QUOTENAME(t.name) + ' ADD ' + t.name + 'ID INT IDENTITY(1,1);'
FROM sys.tables AS t
WHERE name IN (...) -- you will need to fill in this part
AND NOT EXISTS 
(
  SELECT 1 FROM sys.columns WHERE [object_id] = t.[object_id] 
    AND (is_identity = 1 OR name = t.name + 'ID')
);

SELECT @sql;
-- EXEC sp_executesql @sql;

(Note that the SELECT output will show you roughly what the command looks like, but due to output limitations in SSMS and depending on how many tables you have, it won't necessarily show you the full command that will get executed when you uncomment the EXEC.)

And the drop / re-create of the primary keys:

DECLARE @sql NVARCHAR(MAX);

SET @sql = N'';

SELECT @sql = @sql + '
  ALTER TABLE ' + 
  + QUOTENAME(OBJECT_SCHEMA_NAME(t.[object_id]))
  + '.' + QUOTENAME(t.name) + ' DROP CONSTRAINT ' + k.name + ';
  ALTER TABLE ' + 
  + QUOTENAME(OBJECT_SCHEMA_NAME(t.[object_id]))
  + '.' + QUOTENAME(t.name) + ' ADD CONSTRAINT ' 
  + k.name + ' PRIMARY KEY (' + t.name + 'ID);' 
FROM sys.key_constraints AS k
INNER JOIN sys.tables AS t
ON k.parent_object_id = t.[object_id]
WHERE k.[type] = 'PK'
AND t.name IN (...); -- again, you'll want to identify the list of tables

SELECT @sql;
-- EXEC sp_executesql @sql;

And you'll want to do this while the database is in SINGLE_USER mode or while the application(s) are otherwise not able to connect to the database. You'll also want to test all this on a QA or dev system before unleashing any of it on production.

Now, this still isn't exactly best practice - I highly recommend you stop embedding SQL code in your apps, especially SQL code that does insert/select without specifying column lists.

Sql-server – Dissecting Joins

Try this query without the joins first:

select  
 level1.res_name as level1

from resource as level1

where isnull(level1.man_id,0)=0

This returns the rows where man_id isn't referencing anything. In other words, this returns the topmost-level managers only. For your particular example it works like this:

the FROM clause returns the entire table:

level1.Res_id  level1.Res_name  level1.Man_id
-------------  ---------------  -------------
1              sam              3
2              Biju             4
3              adrian           NULL
4              Helen            3
5              Micah            4

the WHERE clause leaves just one row:

level1.Res_id  level1.Res_name  level1.Man_id
-------------  ---------------  -------------
3              adrian           NULL

finally, the SELECT takes from it only one column:
```
level1
------
adrian
```

Adding one self-join like in Chris's query,

select  
 level1.res_name as level1,
 level2.res_name as level2

from resource as level1
left join resource as level2 on level1.res_id=level2.man_id

where isnull(level1.man_id,0)=0

gives you every top-level manager's direct subordinates. More specifically, the join results in the following row set:

level1.Res_id  level1.Res_name  level1.Man_id  level2.Res_id  level2.Res_name  level2.Man_id
-------------  ---------------  -------------  -------------  ---------------  -------------
1              sam              3              NULL           NULL             NULL
2              Biju             4              NULL           NULL             NULL
3              adrian           NULL           1              sam              3
3              adrian           NULL           4              Helen            3
4              Helen            3              2              Biju             4
4              Helen            3              2              Biju             4
5              Micah            4              NULL           NULL             NULL

The WHERE clause filters it down to just this:

level1.Res_id  level1.Res_name  level1.Man_id  level2.Res_id  level2.Res_name  level2.Man_id
-------------  ---------------  -------------  -------------  ---------------  -------------
3              adrian           NULL           1              sam              3
3              adrian           NULL           4              Helen            3

And finally the SELECT clause returns only the names:

level1  level2
------  ------
adrian  sam
adrian  Helen

In the same manner, adding one more join, like this:

select  
 level1.res_name as level1,
 level2.res_name as level2,
 level3.res_name as level3

from resource as level1
left join resource as level2 on level1.res_id=level2.man_id
left join resource as level3 on level2.res_id=level3.man_id

where isnull(level1.man_id,0)=0

brings about the next level of subordinates:

this is the result of joining the third instance of the table to the results of the first join:

1.Res_id  1.Res_name  1.Man_id  2.Res_id  2.Res_name  2.Man_id  3.Res_id  3.Res_name  3.Man_id
--------  ----------  --------  --------  ----------  --------  --------  ----------  --------
1         sam         3         NULL      NULL        NULL      NULL      NULL        NULL
2         Biju        4         NULL      NULL        NULL      NULL      NULL        NULL
3         adrian      NULL      1         sam         3         NULL      NULL        NULL
3         adrian      NULL      4         Helen       3         2         Biju        4
3         adrian      NULL      4         Helen       3         5         Micah       4
4         Helen       3         2         Biju        4         NULL      NULL        NULL
4         Helen       3         5         Micah       4         NULL      NULL        NULL
5         Micah       4         NULL      NULL        NULL      NULL      NULL        NULL

(I've shortened the table aliases for convenience: 1 stands for level1, 2 for level2, and 3 for level3)

this is what remains after the WHERE filter:

1.Res_id  1.Res_name  1.Man_id  2.Res_id  2.Res_name  2.Man_id  3.Res_id  3.Res_name  3.Man_id
--------  ----------  --------  --------  ----------  --------  --------  ----------  --------
3         adrian      NULL      1         sam         3         NULL      NULL        NULL
3         adrian      NULL      4         Helen       3         2         Biju        4
3         adrian      NULL      4         Helen       3         5         Micah       4

and this is what SELECT extracts from the above and gives you back:

level1  level2  level3
------  ------  ------
adrian  sam     NULL
adrian  Helen   Biju
adrian  Helen   Micah

Same continues for the rest of the joins. With your example, however, there will be no more rows, as the hierarchy in the table doesn't go deeper than two levels. Consequently, the other two columns in Chris's original query, level4 and level5, will return NULLs:

level1  level2  level3  level4  level5
------  ------  ------  ------  ------
adrian  sam     NULL    NULL    NULL
adrian  Helen   Biju    NULL    NULL
adrian  Helen   Micah   NULL    NULL

If Chris's assumption that four joins should be enough for most cases does not work for your particular case (and I don't mean the simple example in your question, of course), you can throw in more self-joins using the same pattern.

Best Answer

Related Solutions

Sql-server – SQL Server equivalent to “OVERRIDING USER VALUE”

Sql-server – Dissecting Joins

Related Question