SQL Server – Find continuous records with equal values in each group

sql server

I have a table named "MachineHealth". Here is the sample data:

Date        MachineId  IsBad
2020-01-05  1          1
2020-01-04  1          1
2020-01-03  1          1
2020-01-02  1          0
2020-01-01  1          1
2020-01-05  2          1
2020-01-04  2          1
2020-01-03  2          0
2020-01-02  2          0
2020-01-01  2          1
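
For anyone who wants to reproduce this, the sample data can be loaded with a script along these lines. The column types and the primary key are assumptions, since they are not stated here; treat it as a rough sketch rather than the real table definition.

```sql
-- Assumed schema: the real column types are not given in the question.
-- IsBad is tinyint rather than bit so that it can be aggregated with SUM().
-- The primary key assumes at most one row per machine per day.
CREATE TABLE MachineHealth
(
    [Date]    date    NOT NULL,
    MachineId int     NOT NULL,
    IsBad     tinyint NOT NULL,
    PRIMARY KEY (MachineId, [Date])
);

INSERT INTO MachineHealth ([Date], MachineId, IsBad)
VALUES
    ('2020-01-05', 1, 1),
    ('2020-01-04', 1, 1),
    ('2020-01-03', 1, 1),
    ('2020-01-02', 1, 0),
    ('2020-01-01', 1, 1),
    ('2020-01-05', 2, 1),
    ('2020-01-04', 2, 1),
    ('2020-01-03', 2, 0),
    ('2020-01-02', 2, 0),
    ('2020-01-01', 2, 1);
```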

I want to count the consecutive 1 values in the IsBad field for each MachineId, that is, the number of days each machine has been continuously bad in its most recent run. For example, a query on the data above should return:

MachineId  ContinuousBadDays
1          3
2          2

How can I write this SQL?

Best Answer

Assuming the duplicate Dates for MachineId = 1 were an accident (as Scott pointed out), something like this should be what you're after:

WITH CTE_GoodDates_Sorted AS
(
    -- Number each machine's good Dates from newest (SortId = 1) to oldest; the numbering restarts for each MachineId
    SELECT MachineId, [Date], ROW_NUMBER() OVER (PARTITION BY MachineId ORDER BY [Date] DESC) AS SortId
    FROM MachineHealth
    WHERE IsBad = 0 -- Only good Dates, so we can pick the last good Date per MachineId later
)

SELECT MH.MachineId, SUM(MH.IsBad) AS ContinuousBadDays
FROM MachineHealth AS MH
INNER JOIN CTE_GoodDates_Sorted AS GD
    ON MH.MachineId = GD.MachineId
    AND MH.[Date] > GD.[Date] -- Keep only rows that came after the good Date
WHERE GD.SortId = 1 -- SortId = 1 is the latest good Date for each MachineId, so the join above compares against that Date only
GROUP BY MH.MachineId;

This uses the ROW_NUMBER() window function in a CTE to find the latest Date on which each MachineId was good, then joins to that CTE to filter the results down to the most recent run of continuous bad Dates. The final query then does a GROUP BY on MachineId and sums the IsBad field to get the total days (you could probably use COUNT(1) instead, provided IsBad is never greater than 1).
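
One caveat: because of the INNER JOIN, a machine with no good Dates at all drops out of the result, and a machine whose latest Date is good returns no row rather than 0. If that matters, here is a rough sketch of a variant. It assumes one row per machine per day and that IsBad is only ever 0 or 1 (never NULL). The names CTE_LastGoodDate and LastGoodDate are made up for illustration.

```sql
-- Sketch only: swaps ROW_NUMBER() for MAX() and the INNER JOIN for a LEFT JOIN.
WITH CTE_LastGoodDate AS
(
    -- Latest good Date per machine; machines that were never good get no row here
    SELECT MachineId, MAX([Date]) AS LastGoodDate
    FROM MachineHealth
    WHERE IsBad = 0
    GROUP BY MachineId
)
SELECT MH.MachineId,
       -- Count rows newer than the last good Date, or every row when there is no good Date.
       -- Under the assumptions above, all of those rows are bad.
       SUM(CASE WHEN LG.LastGoodDate IS NULL OR MH.[Date] > LG.LastGoodDate
                THEN 1 ELSE 0 END) AS ContinuousBadDays
FROM MachineHealth AS MH
LEFT JOIN CTE_LastGoodDate AS LG
    ON MH.MachineId = LG.MachineId
GROUP BY MH.MachineId;
```

Like the original, this counts rows rather than calendar days, so a missing Date inside a bad run would not break the streak.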