Sql-server – Generate an increment ID that is unique for a given value of a foreign key

azure-sql-databasesql servert-sql

Disclaimer: This question was asked on SO initially but did not get much traction there, so I am trying here, hoping it will be more interesting to DBAs…


We are working on an ecommerce system where we aggregate orders coming in from different sellers. As you can easily imagine we have an Order table that holds the data for orders of all sellers. Each seller has a unique AccountID which is a foreign key in the Order table.

We want to generate an order number for each order that is coming into the system so that for a given seller (and given AccountID) those order numbers are creating a sequence (first order gets 1, then 2, then 3 etc).

We have tried a couple of solutions to this but they have drawbacks we would like to avoid. All of them are inside a trigger:

ALTER TRIGGER [dbo].[Trigger_Order_UpdateAccountOrderNumber] 
   ON [dbo].[Order]
   AFTER INSERT
BEGIN
     ...
END

Our Solution 1 was:

UPDATE
    [Order]
SET
    AccountOrderNumber = o.AccountOrderNumber
FROM
(
    SELECT
        OrderID,
        AccountOrderNumber =
            ISNULL((SELECT TOP 1 AccountOrderNumber FROM [Order] WHERE AccountID = i.AccountID ORDER BY AccountOrderNumber DESC), 1) +
            (ROW_NUMBER() OVER (PARTITION BY i.AccountID ORDER BY i.OrderID))
    FROM
        inserted AS i
) AS o
WHERE [Order].OrderID = o.OrderID

Note that we have we have READ_COMMITTED_SNAPSHOT ON. It seemed to work well for a while but recently we noticed some duplicate values in the AccountOrderNumber column. After analysing the code it seems logical that duplicates might appear as the operation is not atomic, so if 2 orders are added at the exact same time they will read the same TOP 1 value from the Order table.

After noticing the duplicates we came up with Solution 2 where we have a separate table to track the next AccountOrderNumber for each Account:

DECLARE @NewOrderNumbers TABLE
    (
        AccountID int, 
        OrderID int,
        AccountOrderNumber int
    }

The trigger body is a as follows in that case:

INSERT INTO @NewOrderNumbers (AccountID, OrderID, AccountOrderNumber)
    SELECT
        I.AccountID,
        I.OrderID,
        ASN.Number + (ROW_NUMBER() OVER (PARTITION BY I.AccountID ORDER BY I.OrderID))
    FROM
        inserted AS I
        INNER JOIN AccountSequenceNumber ASN WITH (UPDLOCK) ON I.AccountID = ASN.AccountID AND ASN.AccountSequenceNumberTypeID = @AccountOrderNumberTypeID


    UPDATE [Order] ...

While this solution did not create any duplicates it did cause deadlocks on the newly created @NewOrderNumbers table due to WITH (UPDLOCK). Unfortunately, the locking was necessary to avoid duplicates.

Our latest attempt (Solution 3) is to use sequences. For this we need to create a sequence for each Account in our system and then use it when a new orders are inserted. Here is the code that creates a sequence for AccountID = 1:

CREATE SEQUENCE Seq_Order_AccountOrderNumber_1 AS INT START WITH 1 INCREMENT BY 1 CACHE 100

And the trigger body for AccountID = 1:

    DECLARE @NumbersRangeToAllocate INT = (SELECT COUNT(1) FROM inserted);

    DECLARE @range_first_value_output SQL_VARIANT; 
    EXEC sp_sequence_get_range N'Seq_Order_AccountOrderNumber_1', @range_size = @NumbersRangeToAllocate, @range_first_value = @range_first_value_output OUTPUT; 

    DECLARE @Number INT = CAST(@range_first_value_output AS INT) - 1;
    UPDATE 
        o
    SET 
        @Number = o.AccountOrderNumber = @Number + 1
    FROM 
        dbo.[Order] AS b JOIN inserted AS i on o.OrderID = i.OrderID

The approach with sequences worries us because we expect to have 100K+ accounts in the system pretty soon and for each of those accounts we currently need that sort of incremented ID in 6 different tables. This means we will end up with hundreds of thousands of sequences, which might have a negative impact on performance of the whole DB. We don't know if it will have any impact, but it is close to impossible to find any testimony on the web from people who have used that many sequences in SQL Server.

Finally, the question is: Can you think of a better solution to the problem? It seems like this should be a pretty common use case where you need an ID that is incremented separately for every value of a foreign key. Maybe we are missing something obvious here?

We will also welcome your comments on the 3 solutions detailed above. Maybe one of them is close to being acceptable and just needs some minor tweaking?

Best Answer

It wasn't mentioned, but I would put a UNIQUE constraint/index on the AccountID and Order number to help prevent duplicates (may not be possible if dupes already exist and they can't be cleaned up).

In systems I have worked with, either the ID would be generated during the INSERT operation by getting the current highest value and adding 1 to it, or the use of a sequence table (like you mentioned). I personally didn't like the sequence method; it means relying on a completely unrelated table to generate what is essentially just an incremented number, and there's other ways to do that.

For setting the Order # in code, you have to get the current highest value for Order # for the given AccountID, then add to it. Assuming the AccountID is passed in as a parameter:

INSERT INTO dbo.Orders
(
      AccountID
    , AccountOrderNo
)
SELECT
    AccoutID = @AccountID
    , AccountOrderNo = chk + 1
FROM
    (
        SELECT
            currAccountOrderNo = MAX(AccountOrderNo)
        FROM
            dbo.Orders
        WHERE
            (AccountID = @AccountID)
    ) AS chk
;

I think if you take it out of triggers, you should have less of a problem with duplicates. If you do still get duplicates, the UNIQUE constraint will stop that. I could see a possible option of using the UNIQUE violation error to drive an automatic retry of the code, with a WHILE loop to check for either a status value (good or failed) and a retry attempt (Stop after retrying 5 times). Not sure how necessary this would be.

I think the triggers to try and set the value after it has been inserted are the issue, and if you can move away from there, you'll have better results.