SQL Server – Selecting Distinct Rows but Counting All Rows

sql serversql-server-2012

I have the following table:

----------------------------------------------
| ID | interestingData |      timestamp      |
----------------------------------------------
|  1 |       400       | 2016-01-23 17:01:00 |
----------------------------------------------
|  1 |       400       | 2016-01-24 17:01:00 |
----------------------------------------------
|  1 |       350       | 2016-01-25 17:01:00 |
----------------------------------------------
|  2 |       23        | 2016-01-23 17:01:00 | 
----------------------------------------------
|  2 |       34        | 2016-01-24 17:01:00 | 
----------------------------------------------
|  2 |       12        | 2016-01-25 17:01:00 | 
----------------------------------------------

Where our PK is (ID, timestamp). I'm attempting to determine a query that will give me the unique IDs and the latest interestingData for which interestingData exceeds a threshold. That would, of course, be done with:

SELECT DISTINCT ID
FROM table
WHERE interestingData > threshold
ORDER BY timestamp DESC;

However, I want the count of every occurrence where interestingData exceeded the threshold. My results table would ideally look like

------------------------------------------------------
| ID | interestingData | timestamp           | count |    
------------------------------------------------------
|  1 |        350      | 2016-01-25 17:01:00 |   3   |
------------------------------------------------------

Were my threshold 300. I am aware that if you want to pair something distinct with a set of data then a left outer join is going to be in order, but I'm not entirely sure how to go about it. This is the closest I can think of so far.

SELECT DISTINCT ID
FROM table t1
LEFT OUTER JOIN table t2 ON t1.ID = t2.table.ID
WHERE interestingData > 300
ORDER BY timestamp DESC

This gets me the distinct IDs and pairs them with the rest of the data as I need, but no provisions for getting the other parts of the results, let alone the count.

Best Answer

If you want the interestingData and timestamp from the same row (the most recent row that exceeds the threshold), and if you want to include all rows that exceed the threshold even if some rows for that ID don't meet the threshold, then:

;WITH x AS 
(
  SELECT ID, interestingData, [timestamp], 
    [count] = COUNT(1) OVER (PARTITION BY ID),
    rn = ROW_NUMBER() 
      OVER (PARTITION BY ID ORDER BY [timestamp] DESC)
  FROM dbo.tablename
  WHERE interestingData > 300
)
SELECT ID, interestingData, [timestamp], [count]
  FROM x
  WHERE rn = 1;

Also, try to avoid data types and/or reserved keywords as column names. timestamp is not a great choice because (a) it's not very meaningful and (b) it requires square brackets in a lot of scenarios.