Sql-server – Find continuous records with equal values in each group

sql server

I have a table named "MachineHealth". Here is the sample data:

Date	MachineId	IsBad
2020-01-05	1	1
2020-01-04	1	1
2020-01-03	1	1
2020-01-02	1	0
2020-01-01	1	1
2020-01-05	2	1
2020-01-04	2	1
2020-01-03	2	0
2020-01-02	2	0
2020-01-01	2	1

I want to count the consecutive 1 values in the IsBad field for each MachineId, that is, the number of days each machine has been continuously bad in its most recent streak. For example, a query on the data above should return:

MachineId	ContinuousBadDays
1	3
2	2

How can I write this SQL?
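
For anyone who wants to reproduce this, the sample data can be loaded with a script along these lines. The column types and the primary key are assumptions, since the question does not state them; IsBad is declared as INT here so that it can be summed directly.

```sql
-- Assumed schema: the question gives only column names, not types.
CREATE TABLE MachineHealth
(
    [Date]    DATE NOT NULL,
    MachineId INT  NOT NULL,
    IsBad     INT  NOT NULL, -- INT rather than BIT so that SUM(IsBad) is allowed
    CONSTRAINT PK_MachineHealth PRIMARY KEY (MachineId, [Date]) -- assumes one row per machine per day
);

-- Sample rows copied from the question
INSERT INTO MachineHealth ([Date], MachineId, IsBad)
VALUES
    ('2020-01-05', 1, 1),
    ('2020-01-04', 1, 1),
    ('2020-01-03', 1, 1),
    ('2020-01-02', 1, 0),
    ('2020-01-01', 1, 1),
    ('2020-01-05', 2, 1),
    ('2020-01-04', 2, 1),
    ('2020-01-03', 2, 0),
    ('2020-01-02', 2, 0),
    ('2020-01-01', 2, 1);
```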

Best Answer

Assuming your duplicate Dates for MachineId = 1 were an accident (as Scott pointed out), something like this should be what you're after:

WITH CTE_GoodDates_Sorted AS
(
    -- Number the good rows within each MachineId, newest first; SortId restarts at 1 on the next MachineId
    SELECT MachineId, [Date],
           ROW_NUMBER() OVER (PARTITION BY MachineId ORDER BY [Date] DESC) AS SortId
    FROM MachineHealth
    WHERE IsBad = 0 -- Only want good Dates so we can get the last good Date per MachineId later
)
SELECT MH.MachineId, SUM(MH.IsBad) AS ContinuousBadDays
FROM MachineHealth AS MH
INNER JOIN CTE_GoodDates_Sorted AS GD
    ON MH.MachineId = GD.MachineId
    AND MH.[Date] > GD.[Date] -- Ensures we only get rows that came after the good Date
WHERE GD.SortId = 1 -- Keeps only the latest good Date for each MachineId to compare against in the join
GROUP BY MH.MachineId;

This uses the ROW_NUMBER() window function in a CTE to find the latest Date on which each MachineId was good, then joins to that CTE to filter the results down to only the most recent continuous bad Dates. The final query then does a GROUP BY on MachineId and sums the IsBad field to get the total days. You can probably substitute COUNT(1) for the SUM if IsBad never holds a value above 1; if IsBad is a bit column, SUM will not accept it directly, so use COUNT(1) or cast it to int.
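
One edge case in the query above: because of the INNER JOIN, a machine that has never had a good day has no row in the CTE and so drops out of the result. A possible variant, offered only as a sketch, swaps in MAX with a LEFT JOIN so that such machines are kept and machines whose latest day is good come back as 0. The names LastGood and LastGoodDate are made up for illustration, and the sketch assumes one row per machine per day with a non-NULL IsBad of 0 or 1.

```sql
WITH LastGood AS
(
    -- Latest good Date per MachineId; machines with no good day get no row here
    SELECT MachineId, MAX([Date]) AS LastGoodDate
    FROM MachineHealth
    WHERE IsBad = 0
    GROUP BY MachineId
)
SELECT MH.MachineId,
       -- Count rows newer than the last good Date, or every row if there is none
       SUM(CASE WHEN LG.LastGoodDate IS NULL OR MH.[Date] > LG.LastGoodDate
                THEN 1 ELSE 0 END) AS ContinuousBadDays
FROM MachineHealth AS MH
LEFT JOIN LastGood AS LG
    ON MH.MachineId = LG.MachineId
GROUP BY MH.MachineId;
```

On the sample data in the question this should give the same 3 and 2. Like the original, it does not treat a missing calendar day as a break in the streak.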

Related Solutions

Sql-server – a good way to query data stored in multiple identical tables split by date/time

Personally, I'd consider using triggers to load a proper single table and query that. Then you have control of partitioning (if needed), defining good indexes, etc.

Depending on volumes, you could use service broker to decouple the app and your tables.

Basically, anything to avoid massive UNIONs and dynamic SQL...

Sql-server – Select query – group by and sub group by

If I have understood correctly what you are after:

; WITH cte AS
  ( SELECT pid, uid, event, timeStamp,
           maxts = MAX(timeStamp) OVER (PARTITION BY pid, uid)
    FROM tableX
  )
SELECT pid, cnt = COUNT(*)
FROM cte
WHERE maxts = timeStamp
  AND event = 200 
GROUP BY pid ;

If there is a chance that you have 2 rows with identical pid, uid and timestamp, you'll need to replace COUNT(*) with COUNT(DISTINCT uid).
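
With that substitution applied, the query would look roughly like this (same hypothetical tableX and columns as above):

```sql
; WITH cte AS
  ( SELECT pid, uid, event, timeStamp,
           maxts = MAX(timeStamp) OVER (PARTITION BY pid, uid)
    FROM tableX
  )
-- COUNT(DISTINCT uid) counts each uid once even if its latest timestamp is duplicated
SELECT pid, cnt = COUNT(DISTINCT uid)
FROM cte
WHERE maxts = timeStamp
  AND event = 200
GROUP BY pid ;
```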
