SQL Server – How to Filter Results from a Distinct Count

cte, sql-server, sql-server-2005

I need to modify a procedure that computes statistics on my database.

I have a temporary table that contains all the data I need for those statistics.
It holds every relevant action (FormIdNb) performed by the working units (NumCavalier) during one day:

-- Every action from all units will be stored in this temp table
-- DateHeure = Time of action
-- NumCavalier = ID of working unit
-- FormIdNb = Action type
CREATE TABLE #tmp_DailyMissions (
DateHeure datetime,
NumCavalier  nchar(3),
FormIdNb int
)

INSERT INTO #tmp_DailyMissions (DateHeure, NumCavalier, FormIdNb)
SELECT DateHeure, Cavalier.NumCavalier, FormIdNb
FROM ECN4
  INNER JOIN Cavalier
    ON Cavalier.NumCavalier = ECN4.NumCavalier
WHERE DateHeure BETWEEN @StartTime AND @EndTime
  AND Cavalier.Terminal = @terminal

This temporary table stores every action performed by every working unit over one full day (much like a log).
The significant values of FormIdNb are:

FormIdNb > 3 ==> The working unit was active
FormIdNb = 3 ==> The working unit was idle
FormIdNb < 3 ==> The working unit was inactive

The query I have right now (thanks to @AndriyM) returns the number of working units that were ACTIVE, IDLE and INACTIVE in each 15-minute interval of the day:

WITH time_cte(StartTime, EndTime) AS
  (
    SELECT
      StartTime = @starttime,
      EndTime   = DATEADD(mi, 15, @starttime)

    UNION ALL

    SELECT
      EndTime,
      DATEADD(mi, 15, EndTime)
    FROM
      time_cte
    WHERE
      EndTime < @EndTime
  )
SELECT
  t.StartTime,
  ActiveUnitCount   = COUNT(DISTINCT(CASE WHEN d.FormIdNb > 3 THEN d.NumCavalier END)),
  IdleUnitCount     = COUNT(DISTINCT(CASE WHEN d.FormIdNb = 3 THEN d.NumCavalier END)),
  InactiveUnitCount = COUNT(DISTINCT(CASE WHEN d.FormIdNb < 3 THEN d.NumCavalier END))
FROM
  time_cte AS t
  LEFT OUTER JOIN #tmp_DailyMissions AS d
    ON d.DateHeure >= t.StartTime
    AND d.DateHeure < t.EndTime
GROUP BY
  t.StartTime
;

The problem is that, as I said above, the temporary table stores the data like a log.
This means that, for a given working unit within a single time interval, it can contain any mix of FormIdNb values.
For example: one row with FormIdNb = 2, one row with FormIdNb = 3 and one row with FormIdNb = 4. With the query above, that unit is counted as ACTIVE, IDLE and INACTIVE in the same interval.

My question is:
How can I write a query that discards the irrelevant values?

More precisely:
If a working unit was ACTIVE but was also IDLE and/or INACTIVE during an interval, how can I "remove" it from the IDLE and INACTIVE counts and keep only the maximum value of FormIdNb for each interval?

Best Answer

I agree with Scott Hodgin: you essentially want to consider MAX(FormIdNb) per working unit and interval instead of each individual FormIdNb, and count the units based on whether that maximum is greater than, equal to or less than 3.

Adding a correlated subquery to the joining condition, as Scott is suggesting, would be one way.
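Scott's exact query is not reproduced here, so the following is only a sketch of what a correlated subquery in the join condition could look like. The table #demo_DailyMissions, its rows and the variable values are made up for illustration, and the syntax is kept compatible with SQL Server 2005.

```sql
-- Hypothetical sample data (not from the original post).
DECLARE @StartTime datetime, @EndTime datetime;
SET @StartTime = '20120101 08:00';
SET @EndTime   = '20120101 08:30';

CREATE TABLE #demo_DailyMissions (
  DateHeure   datetime,
  NumCavalier nchar(3),
  FormIdNb    int
);

INSERT INTO #demo_DailyMissions (DateHeure, NumCavalier, FormIdNb)
SELECT '20120101 08:01', N'001', 2 UNION ALL  -- unit 001: inactive, idle and active
SELECT '20120101 08:05', N'001', 3 UNION ALL
SELECT '20120101 08:10', N'001', 4 UNION ALL
SELECT '20120101 08:02', N'002', 3 UNION ALL  -- unit 002: idle and inactive
SELECT '20120101 08:07', N'002', 1 UNION ALL
SELECT '20120101 08:03', N'003', 1 UNION ALL  -- unit 003: inactive only
SELECT '20120101 08:20', N'001', 3 UNION ALL  -- second interval
SELECT '20120101 08:16', N'002', 2 UNION ALL
SELECT '20120101 08:25', N'002', 2;

WITH time_cte(StartTime, EndTime) AS
  (
    SELECT @StartTime, DATEADD(mi, 15, @StartTime)
    UNION ALL
    SELECT EndTime, DATEADD(mi, 15, EndTime)
    FROM time_cte
    WHERE EndTime < @EndTime
  )
SELECT
  t.StartTime,
  ActiveUnitCount   = COUNT(DISTINCT CASE WHEN d.FormIdNb > 3 THEN d.NumCavalier END),
  IdleUnitCount     = COUNT(DISTINCT CASE WHEN d.FormIdNb = 3 THEN d.NumCavalier END),
  InactiveUnitCount = COUNT(DISTINCT CASE WHEN d.FormIdNb < 3 THEN d.NumCavalier END)
FROM
  time_cte AS t
  LEFT OUTER JOIN #demo_DailyMissions AS d
    ON d.DateHeure >= t.StartTime
    AND d.DateHeure < t.EndTime
    -- Keep only the rows that carry the unit's highest FormIdNb in this interval.
    AND d.FormIdNb = (
      SELECT MAX(m.FormIdNb)
      FROM #demo_DailyMissions AS m
      WHERE m.NumCavalier = d.NumCavalier
        AND m.DateHeure >= t.StartTime
        AND m.DateHeure < t.EndTime
    )
GROUP BY
  t.StartTime
ORDER BY
  t.StartTime;
-- 08:00 -> Active 1, Idle 1, Inactive 1
-- 08:15 -> Active 0, Idle 1, Inactive 1

DROP TABLE #demo_DailyMissions;
```

COUNT(DISTINCT ...) is still needed in this variant, because a unit can have several rows that share the maximum value (unit 002 in the second interval has two rows with FormIdNb = 2).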

Another would be to use a derived table to first get the max FormIdNb per NumCavalier and range:

SELECT
  t.StartTime,
  d.NumCavalier,
  MaxFormIdNb = MAX(d.FormIdNb)
FROM
  time_cte AS t
  LEFT OUTER JOIN #tmp_DailyMissions AS d
    ON d.DateHeure >= t.StartTime
    AND d.DateHeure < t.EndTime
GROUP BY
  t.StartTime,
  d.NumCavalier

and only then get the counts. This is the whole query:

WITH time_cte(StartTime, EndTime) AS
  (
    SELECT
      StartTime = @starttime,
      EndTime   = DATEADD(mi, 15, @starttime)

    UNION ALL

    SELECT
      EndTime,
      DATEADD(mi, 15, EndTime)
    FROM
      time_cte
    WHERE
      EndTime < @EndTime
  )
SELECT
  s.StartTime,
  ActiveUnitCount   = COUNT(CASE WHEN s.MaxFormIdNb > 3 THEN s.NumCavalier END),
  IdleUnitCount     = COUNT(CASE WHEN s.MaxFormIdNb = 3 THEN s.NumCavalier END),
  InactiveUnitCount = COUNT(CASE WHEN s.MaxFormIdNb < 3 THEN s.NumCavalier END)
FROM
  (
    -- One row per interval and working unit, carrying that unit's highest FormIdNb
    SELECT
      t.StartTime,
      d.NumCavalier,
      MaxFormIdNb = MAX(d.FormIdNb)
    FROM
      time_cte AS t
      LEFT OUTER JOIN #tmp_DailyMissions AS d
        ON d.DateHeure >= t.StartTime
        AND d.DateHeure < t.EndTime
    GROUP BY
      t.StartTime,
      d.NumCavalier
  ) AS s
GROUP BY
  s.StartTime
;

Note that it is now just COUNT(...), not COUNT(DISTINCT ...), because the derived table already returns at most one row per NumCavalier and interval.
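To see the derived-table approach on concrete rows, here is a small self-contained sketch. The table #demo_DailyMissions, its rows and the variable values are invented for illustration; in the outer query the columns are referenced through the derived table's alias s. The third interval deliberately has no rows, to show that the LEFT OUTER JOIN still returns it with zero counts.

```sql
-- Hypothetical sample data (not from the original post).
DECLARE @StartTime datetime, @EndTime datetime;
SET @StartTime = '20120101 08:00';
SET @EndTime   = '20120101 08:45';

CREATE TABLE #demo_DailyMissions (
  DateHeure   datetime,
  NumCavalier nchar(3),
  FormIdNb    int
);

INSERT INTO #demo_DailyMissions (DateHeure, NumCavalier, FormIdNb)
SELECT '20120101 08:01', N'001', 2 UNION ALL  -- unit 001: max is 4 -> ACTIVE
SELECT '20120101 08:05', N'001', 3 UNION ALL
SELECT '20120101 08:10', N'001', 4 UNION ALL
SELECT '20120101 08:02', N'002', 3 UNION ALL  -- unit 002: max is 3 -> IDLE
SELECT '20120101 08:07', N'002', 1 UNION ALL
SELECT '20120101 08:03', N'003', 1 UNION ALL  -- unit 003: max is 1 -> INACTIVE
SELECT '20120101 08:20', N'001', 3 UNION ALL  -- second interval
SELECT '20120101 08:16', N'002', 2 UNION ALL
SELECT '20120101 08:25', N'002', 2;

WITH time_cte(StartTime, EndTime) AS
  (
    SELECT @StartTime, DATEADD(mi, 15, @StartTime)
    UNION ALL
    SELECT EndTime, DATEADD(mi, 15, EndTime)
    FROM time_cte
    WHERE EndTime < @EndTime
  )
SELECT
  s.StartTime,
  ActiveUnitCount   = COUNT(CASE WHEN s.MaxFormIdNb > 3 THEN s.NumCavalier END),
  IdleUnitCount     = COUNT(CASE WHEN s.MaxFormIdNb = 3 THEN s.NumCavalier END),
  InactiveUnitCount = COUNT(CASE WHEN s.MaxFormIdNb < 3 THEN s.NumCavalier END)
FROM
  (
    -- One row per interval and unit; empty intervals yield a single all-NULL row.
    SELECT
      t.StartTime,
      d.NumCavalier,
      MaxFormIdNb = MAX(d.FormIdNb)
    FROM
      time_cte AS t
      LEFT OUTER JOIN #demo_DailyMissions AS d
        ON d.DateHeure >= t.StartTime
        AND d.DateHeure < t.EndTime
    GROUP BY
      t.StartTime,
      d.NumCavalier
  ) AS s
GROUP BY
  s.StartTime
ORDER BY
  s.StartTime;
-- 08:00 -> Active 1, Idle 1, Inactive 1
-- 08:15 -> Active 0, Idle 1, Inactive 1
-- 08:30 -> Active 0, Idle 0, Inactive 0

DROP TABLE #demo_DailyMissions;
```

One caveat on the recursive CTE: SQL Server stops a recursive CTE after 100 recursion levels by default. A single day in 15-minute steps (96 intervals) stays under that limit, but a longer range would need an OPTION (MAXRECURSION n) hint at the end of the statement.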