Sql-server – How to use Count() or Sum() with advanced criteria such as matching the most recent rows where a condition is true

sql server

I have created a table with the following structure:

| ID | RecordDate       | SerialNumber | ProductID | StageID | Result |
|----+------------------+--------------+-----------+---------+--------|
|  0 | 2019-05-23 06:00 | A123456789   |         3 |       6 |      1 |
|  1 | 2019-05-23 06:04 | B123456789   |         4 |       6 |      0 |

It is interesting to be able to generate some basic statistics anchored on the StageID or the ProductID. Like this:

DECLARE @StartDate datetime = '2019-05-10 08:00:00';
DECLARE @EndDate datetime = '2019-05-18 09:00:00';
DECLARE @StageID int = 6;

SELECT ProductID,
    COUNT (*) 'Total',
    SUM (CASE WHEN Result = 1 THEN 1 ELSE 0 END) 'Output'

FROM table1
WHERE RecordDate BETWEEN @StartDate AND @EndDate AND StageID = @StageID
GROUP BY ProductID;

This yields the following typical output:

| ProductID | Total | Output |
|-----------+-------+--------|
|         3 |  4533 |   4211 |
|         4 |  3444 |   3001 |
|           |       |        |

Where:

Total = total number of product processed including passes, fails, duplicates (re-attempts) etc.

Output = count of good product which has been sent on from the process.

Now, one problem is that my count of output isn't very precise. If someone were to process the same product twice, I would count it twice even though the true output stays the same.
I have also counted product whose result was 1, but then it was processed again and the result was zero. I count this even though the true output has actually gone down.

What I really want to do in this scenario is count the most recent record for each SerialNumber where that result was 1. I can get that information with a basic query like this:

SELECT COUNT(*) 
FROM ( SELECT MAX(ID) AS 'ID' 
       FROM table1
       WHERE StageID = @StageID 
         AND Result = 1 
         AND RecordDate BETWEEN @StartDate AND @EndDate 
       GROUP BY SerialNumber ) ID

This query gives me the single result that I want, but I can't figure out how to apply such criteria to the original format so that I get the given table grouped by ProductID. I feel like I want another count() or sum() function with some critieria attached.

How do I do it? Or is there a better way? I am very new to SQL so there could be a basic answer staring me in the face here.

Best Answer

Using a common table expression for Max(RecordDate) might help.

Not sure if this code is just what you need, but you can modify. The script works on the sample table that I created from your question.

DECLARE @StartDate datetime = '2019-05-10 08:00:00';
DECLARE @EndDate datetime = '2019-05-24 09:00:00';
DECLARE @StageID int = 6;

WITH MyRecentSerialNumber AS
    (SELECT ID, MAX(RecordDate) AS MaxRecordDate,   SerialNumber,   ProductID,  StageID,        Result 
    FROM t1
    WHERE Result = 1
    GROUP BY ID, SerialNumber, ProductID, StageID, Result)
SELECT ProductID, COUNT (*) 'Total',
    SUM (CASE WHEN Result = 1 THEN 1 ELSE 0 END) 'Output'
FROM MyRecentSerialNumber
WHERE MaxRecordDate BETWEEN @StartDate AND @EndDate AND StageID = @StageID
GROUP BY ProductID;

Output is the same with the sample data, but you can test and adjust with data that should be excluded.

enter image description here