DECLARE @d TABLE(dt DATETIME);
INSERT @d VALUES
('2013-02-22 09:09:00'),
('2013-02-22 13:28:09'),
('2013-02-25 09:43:16'),
('2013-02-25 15:15:19'),
('2013-02-25 19:30:00'),
('2013-02-25 21:30:00'),
('2013-02-26 08:27:54'),
('2013-02-26 16:31:51'),
('2013-02-27 08:52:59'),
('2013-02-27 15:08:35'),
('2013-02-28 08:37:54'),
('2013-02-28 16:37:08'),
('2013-02-28 18:30:00'),
('2013-02-28 20:30:00');
;WITH x AS
(
SELECT
e = dt,
r = ROW_NUMBER() OVER (ORDER BY dt),
p = LAG(dt, 1) OVER (ORDER BY dt),
m = DATEDIFF(MINUTE, LAG(dt, 1) OVER (ORDER BY dt), dt)
FROM @d
)
SELECT
start = p,
[end] = e,
[minutes_elapsed] = m,
[time] = CONVERT(TIME(0), DATEADD(MINUTE, m, 0))
FROM x
WHERE r % 2 = 0
ORDER BY e;
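The [time] column works by adding the elapsed minutes to "day zero" (the integer 0 implicitly converts to the datetime 1900-01-01) and casting the result to TIME(0). A minimal illustration of the trick:

```sql
-- 259 minutes past day zero renders as hours:minutes:seconds
SELECT CONVERT(TIME(0), DATEADD(MINUTE, 259, 0));  -- 04:19:00
```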
If you need to do this on older versions (2005 through 2008 R2), which don't support LAG, you can pair rows with a self-join instead:
;WITH cte1 AS
(
SELECT
e = dt,
r = ROW_NUMBER() OVER (ORDER BY dt)
FROM @d
),
cte2 AS
(
SELECT
x.r,
[start] = x.e,
[end] = y.e,
m = DATEDIFF(MINUTE, x.e, y.e)
FROM cte1 AS x INNER JOIN cte1 AS y
ON x.r = y.r - 1
WHERE y.r % 2 = 0
)
SELECT
[start],
[end],
minutes_elapsed = m,
[time] = CONVERT(CHAR(8), DATEADD(MINUTE, m, 0), 108)
FROM cte2
ORDER BY [start];
Suggested schema for recording time in/time out is pretty simple:
CREATE TABLE dbo.Shifts
(
EmployeeID INT NOT NULL, -- probably Foreign Key, right?
ClockIn DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
ClockOut DATETIME NULL
);
Now your query is:
SELECT
EmployeeID,
ClockIn,
ClockOut,
ShiftLength = CONVERT(TIME(0), DATEADD(MINUTE,
DATEDIFF(MINUTE, ClockIn,
COALESCE(ClockOut, CURRENT_TIMESTAMP) -- to capture shift in progress
), 0))
FROM dbo.Shifts
WHERE ...
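As a sketch of how the table might be used (assuming @EmployeeID is a parameter supplied by your application; it's not part of the original answer): clocking in is an INSERT that relies on the ClockIn default, and clocking out closes the employee's open row.

```sql
-- Clock in: ClockIn defaults to CURRENT_TIMESTAMP
INSERT dbo.Shifts (EmployeeID) VALUES (@EmployeeID);

-- Clock out: close this employee's shift in progress
UPDATE dbo.Shifts
   SET ClockOut = CURRENT_TIMESTAMP
 WHERE EmployeeID = @EmployeeID
   AND ClockOut IS NULL;
```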
COUNT(*) Returns all rows ignoring nulls right?
I'm not sure what you mean by "ignoring nulls" here. It returns the number of rows irrespective of any NULLs.
SELECT COUNT(*)
FROM (VALUES (CAST(NULL AS INT)),
             (CAST(NULL AS INT))) V(C)
Returns 2.
Altering the above query to COUNT(C) would return 0, as when using COUNT with an expression other than *, only NOT NULL values of that expression are counted.
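To see the two behaviors side by side, here's a small sketch (the third row is non-NULL just to show the difference):

```sql
SELECT COUNT(*) AS AllRows,   -- 3: counts every row
       COUNT(C) AS NonNullC   -- 1: counts only non-NULL values of C
FROM (VALUES (CAST(NULL AS INT)),
             (CAST(NULL AS INT)),
             (1)) V(C);
```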
Suppose the table in your question has the following source data
+---------+---------------+
| Name | MaritalStatus |
+---------+---------------+
| Albert | Single |
| Bob | Single |
| Charles | Single |
| David | Single |
| Edward | Married |
| Fred | Married |
| George | NULL |
+---------+---------------+
The query
SELECT MaritalStatus,
COUNT(*) AS CountResult
FROM T
GROUP BY MaritalStatus
Returns
+---------------+-------------+
| MaritalStatus | CountResult |
+---------------+-------------+
| Single | 4 |
| Married | 2 |
| NULL | 1 |
+---------------+-------------+
Hopefully it is obvious how that result relates to the original data.
What does COUNT(*) OVER() do?
Adding that into the SELECT list for the previous query produces
+---------------+-------------+-----------------+
| MaritalStatus | CountResult | CountOverResult |
+---------------+-------------+-----------------+
| Single | 4 | 3 |
| Married | 2 | 3 |
| NULL | 1 | 3 |
+---------------+-------------+-----------------+
Notice that the result set has 3 rows and CountOverResult is 3. This is not a coincidence.
This is because it logically operates on the result set after the GROUP BY.
COUNT(*) OVER () is a windowed aggregate. The absence of any PARTITION BY or ORDER BY clause means that the window it operates on is the whole result set.
In the case of the query in your question, the value of CountOverResult is the same as the number of distinct MaritalStatus values that exist in the base table, because there is one row for each of these in the grouped result.
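For reference, the windowed count shown in the output above can be produced like this (a sketch against the same table T):

```sql
SELECT MaritalStatus,
       COUNT(*) AS CountResult,            -- rows per group
       COUNT(*) OVER () AS CountOverResult -- rows in the whole grouped result
FROM T
GROUP BY MaritalStatus;
```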
Best Answer
The second round starting with ID 4 has the time 08:11:55 in your sample data, so if indeed that's the time you're supposed to use as the starting point, the total should amount to 01:31:58. At any rate, here's a solution using the LAG and LEAD window functions. If you want to prevent a sort in the plan, make sure you create the following supporting index:
If you're running on 2016+, you can enable batch mode processing by creating the following dummy index:
Prior to SQL Server 2019, SQL Server would not consider batch mode processing unless a columnstore index was present on at least one of the tables participating in the query. Creating this dummy index, even though it's meaningless in its own right, enables the more optimal batch mode processing for window functions. Check out the plans both with and without the index. I explain this in detail here.
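The dummy index itself isn't reproduced here; a common pattern (the table and column names below are placeholders, not from the original answer) is an empty filtered nonclustered columnstore index with a contradictory predicate, so it never contains any data but still makes batch mode available:

```sql
-- Hypothetical sketch: empty filtered columnstore index (2016+)
-- whose only purpose is to unlock batch mode pre-2019
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Dummy
    ON dbo.YourTable (ID)
    WHERE ID = -1 AND ID = -2;
```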
Here's the solution code:
I formatted the output as TIME assuming the total amount of time will be less than 24 hours. If it could be more, you'd just need to add a bit of formatting logic.