Sql-server – sql server 2014 – Find gaps in sequence on 2 different fields group by third field

gaps-and-islandssql server

I need to detect the gaps between the sequences for each different seller considering that sequence is in 2 different rows considering only supplied range (avoiding numbers below the first record and numbers above the last record):

I have a table with orders where relevant fields are:

    SellerID | OrderID |   OID   |
    seller1  | 123456  |         |
    seller1  | 123457  |         |
    seller1  | 123458  | 123460  |
    seller1  | 123459  | 123460  |
    seller2  | 234567  |         | (first record)
    seller1  | 123455  |         | 
    seller2  | 234568  | 234570  |
    seller2  | 234569  | 234570  | 
    seller1  | 123463  |         |
    seller1  | 123466  |         |
    seller1  | 123453  |         | (first record)
    seller2  | 234572  |         | (last record)
    seller1  | 123470  |         | (last record)

I expect a result like this:

    seller 1 | 123454
    seller 1 | 123461
    seller 1 | 123462
    seller 1 | 123464
    seller 1 | 123465
    seller 1 | 123467
    seller 1 | 123468
    seller 1 | 123469
    seller 2 | 234571

it is quite easy also for me to find gaps in a single row and for single seller:
I used this query that I repeat for every seller:

Declare @SellerID nvarchar(50)='seller1'
SELECT s1.OrderID
FROM Orders s1
LEFT JOIN Orders s2
ON s1.OrderID = s2.OrderID -1
WHERE s2.OrderID IS NULL
and SellerID=@SellerID
order by s1.OrderID asc 

but I do not know how to modify in way to check for the sequence in the 2 columns

moreover, is there a way to achieve this in a single query instead of repeating for each seller?

Best Answer

You can use CROSS APPLY to "unpivot" multiple columns. This query will give you a list of orders (unpivoted). I've added a third column, which is the LEADing (i.e. next) OrderID, if there is any:

SELECT o.SellerID, x.OrderID,
       LEAD(x.OrderID, 1) OVER (
           PARTITION BY o.SellerID
           ORDER BY x.OrderID) AS _nextOrderID
FROM Orders AS o
CROSS APPLY (
    SELECT o.OrderID
    UNION ALL
    SELECT o.OID AS OrderID WHERE o.OID IS NOT NULL
    ) AS x(OrderID);

This gives you:

SellerID   OrderID     _nextOrderID
---------- ----------- ------------
seller1    123453      123455
seller1    123455      123456
seller1    123456      123457
seller1    123457      123458
seller1    123458      123459
seller1    123459      123460
seller1    123460      123460
seller1    123460      123463
seller1    123463      123466
seller1    123466      123470
seller1    123470      NULL
seller2    234567      234568
seller2    234568      234569
seller2    234569      234570
seller2    234570      234570
seller2    234570      234572
seller2    234572      NULL

Now, if we subtract the OrderID from the _nextOrderID, we get the gap (how many orders are "missing"):

SELECT o.SellerID, x.OrderID,
       LEAD(x.OrderID, 1) OVER (
           PARTITION BY o.SellerID
           ORDER BY x.OrderID)-x.OrderID AS _gap
FROM Orders AS o
CROSS APPLY (
    SELECT o.OrderID
    UNION ALL
    SELECT o.OID AS OrderID WHERE o.OID IS NOT NULL
    ) AS x(OrderID);

.. which looks something like this:

SellerID   OrderID     _gap
---------- ----------- -----------
seller1    123453      2
seller1    123455      1
seller1    123456      1
seller1    123457      1
seller1    123458      1
seller1    123459      1
seller1    123460      0    <- 0 because you have two OrderID=123460
seller1    123460      3
seller1    123463      3
seller1    123466      4
seller1    123470      NULL
seller2    234567      1
seller2    234568      1
seller2    234569      1
seller2    234570      0
seller2    234570      2
seller2    234572      NULL

So where _gap<=1, we don't want to return any rows. Where _gap=4, we want to insert 3 rows, from OrderID+1 to OrderID+3. There are a few ways to do this, but I'm going to stick with CROSS APPLY here as well.

I'm putting the query above in a subquery (called sub in my example), and for each row in that result, I'm going to CROSS APPLY any dummy table. If you expect large gaps, you may want to create a separate table with a single IDENTITY column for this purpose, but I'm just going to use the Orders table along with a ROW_NUMBER():

SELECT sub.SellerID, sub.OrderID+n.rownum AS OrderID
FROM (
    SELECT o.SellerID, x.OrderID,
           LEAD(x.OrderID, 1) OVER (
               PARTITION BY o.SellerID
               ORDER BY x.OrderID)-x.OrderID AS _gap
    FROM Orders AS o
    CROSS APPLY (
        SELECT o.OrderID
        UNION ALL
        SELECT o.OID AS OrderID WHERE o.OID IS NOT NULL
        ) AS x(OrderID)
    ) AS sub
CROSS APPLY (
    --- For each row in sub, where _gap>1, return
    --- (_gap) number of rows, starting with 1, 2, 3, ..., (_gap-1).
    SELECT TOP (sub._gap-1)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS rownum
    FROM Orders    -- or any dummy table
    ) AS n
--- Only where there's actually a gap:
WHERE sub._gap>1;

Finally, at the top, we add n.rownum to sub.OrderID to get the missing OrderID.

Here's the final output:

SellerID   OrderID
---------- --------------------
seller1    123454
seller1    123461
seller1    123462
seller1    123464
seller1    123465
seller1    123467
seller1    123468
seller1    123469
seller2    234571