SQL Server 2012 – Is total_elapsed_time in DMV sys.dm_exec_requests Accurate?

dmvsql serversql-server-2012

I am running SQL Server 2012 and am trying to put some queries together for monitoring using the DMVs. However, when looking at the total_elapsed_time field in the sys.dm_exec_requests DMV, the numbers look way off. Here's an example:

SELECT
  session_id, RunTime = CURRENT_TIMESTAMP,
  start_time, total_elapsed_time
FROM sys.dm_exec_requests
WHERE session_id = 284;

session_id  RunTime                 start_time              total_elapsed_time
284         2016-04-07 16:14:03.690 2016-04-07 16:08:14.587 1419976

By my calculations*, the elapsed time should be around around 349,103 — not 1,419,976. That's off by over a factor of 4.

* From the difference, in milliseconds, between the current time and the start_time i.e.
SELECT DATEDIFF(MILLISECOND, '2016-04-07T16:08:14.587', '2016-04-07T16:14:03.690');

Here's the server info:

SELECT @@VERSION;

Microsoft SQL Server 2012 - 11.0.5592.0 (X64) 
    Apr 17 2015 15:18:46 
    Copyright (c) Microsoft Corporation
    Enterprise Edition: Core-based Licensing (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)

Any ideas what could be causing this discrepancy?

Best Answer

There is a bug that aggregates the time in a parallel operation. This is fixed in 2014.

The total_elapsed_time will be correct for a particular parallel query in a batch until it moves on to the next statement in the batch, then the total_elapsed_time will explode by the DOP.

Example

Run this in one query window:

USE AdventureWorks2012
GO
SELECT *
FROM Sales.SalesOrderDetail sod
JOIN Production.Product p ON sod.ProductID = p.ProductID
ORDER BY Style 

waitfor delay '00:00:15'

Run this in a second:

select 
    DATEDIFF(ms, start_time, getdate()) ActualElapsedTime,
    total_elapsed_time from sys.dm_exec_requests
where session_id = <your session_id for the above batch>

The two values will be close to identical until SQL Server moves to the WAITFORDELAY statement, then you should see the total_elapsed_time explode (assuming the first query has a parallel plan as it does on my server).

I remember working on this for a customer a few years ago. Found the bug in the internal database (I am a Microsoft Premier Developer Consultant), but no public reference.