Sql-server – Exclude rows from SELECT

sql server

I have a massive table where a row is defined by two ids (Tid,Bid) that looks like this:

------------------------------------------------------
| Tid | Bid |  Action  | Status     | Value |  Time  |
|  1  |  T  |   Insert |     NULL   |   50  |  10:11 |
|  1  |  T  |   Update |   Executed |   50  |  10:12 |
|  1  |  T  |   Remove |   Executed |   50  |  10:50 |
|  1  |  S  |   Insert |     NULL   |   10  |  10:10 |
|  1  |  S  |   Update |   Executed |   10  |  10:11 |
|  2  |  T  |   Insert |   Executed |   22  |  12:20 |
|  2  |  T  |   Remove |   Executed |   22  |  12:44 |
|  3  |  B  |   Insert |   Executed |   44  |  15:21 |
|  3  |  B  |   Update |   Executed |   48  |  15:25 |
------------------------------------------------------

From this table I want to select only rows with the latest time and have the status Executed and exclude all rows with the same ids if one of those rows contained the Action Remove, so the final resoult should look like this :

------------------------------------------------------
| Tid | Bid |  Action  | Status     | Value |  Time  |
|  1  |  S  |   Update |   Executed |   10  |  10:11 |
|  3  |  B  |   Update |   Executed |   48  |  15:25 |
------------------------------------------------------

Currently my solution is built with multiple nested Selects and it dosent run very fast, is there a more elegant way to solve this?

Best Answer

I think it can be expressed a little more succinctly:

;WITH x AS 
(
  SELECT Tid, Bid, [Action], [Status], Value, [Time], rn = ROW_NUMBER() 
    OVER (PARTITION BY Tid, Bid, [Status] ORDER BY [time] DESC)
  FROM dbo.MassiveTable
)
SELECT Tid, Bid, [Action], [Status], Value, [Time]
FROM x WHERE Status = 'Executed' AND rn = 1
AND NOT EXISTS 
(
  SELECT 1 FROM x AS t
  WHERE t.Tid = x.Tid 
    AND t.Bid = x.Bid
    AND t.[Action] = 'Remove'
);

Note that this assumes that a Remove value anywhere in the timeline makes that Tid, Bid combination invalid. If a Remove is always guaranteed to be last, and the only one in the timeline for that combination (or you don't care about earlier removes as long as they weren't the last entry), you can cheat and get a slightly more efficient plan:

;WITH x AS 
(
  SELECT Tid, Bid, [Action], [Status], Value, [Time], rn = ROW_NUMBER() 
    OVER (PARTITION BY Tid, Bid, [Status] ORDER BY [time] DESC)
  FROM dbo.MassiveTable
  WHERE [Status] = 'Executed'
)
SELECT Tid, Bid, [Action], [Status], Value, [Time]
FROM x WHERE [Action] <> 'Remove' AND rn = 1;

But those assumptions are important. Test all queries with this additional row of sample data:

(3, 'B', 'Remove', 'Executed', 45, '15:23'),