Get IO Stats with a SQL Query Like Activity Monitor – SQL Server

sql-server, ssms, statistics

Is it possible to get the same value shown in Activity Monitor (SSMS) for "Database IO" as the result of a SQL query?

[Activity Monitor screenshot]

Best Answer

You can use a query like this to calculate read/write rates and latency (though due to my laziness these figures are not in units that match Activity Monitor).

SELECT d.name, f.name, f.type_desc, f.physical_name,
 [read b/ms] = num_of_bytes_read * 1.0/sample_ms,
 [avg read latency] =
   (1.0*s.io_stall_read_ms / (COALESCE(NULLIF(s.num_of_reads,0),1))),
 [write b/ms] = num_of_bytes_written * 1.0/sample_ms,
 [avg write latency] = 
   (1.0*s.io_stall_write_ms / (COALESCE(NULLIF(s.num_of_writes,0),1)))
FROM sys.master_files AS f
INNER JOIN sys.databases AS d
ON f.database_id = d.database_id
INNER JOIN sys.dm_io_virtual_file_stats(default, default) AS s
ON d.database_id = s.database_id
AND f.[file_id] = s.[file_id];
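
If you want these figures in units closer to what Activity Monitor displays (MB/sec and ms), here is a minimal variation of the query above. Note that these are still averages over the whole period the stats have been accumulating, not the live per-interval rates the graph shows:

SELECT d.name, f.name, f.type_desc, f.physical_name,
 -- bytes/ms converted to MB/sec (x 1000 ms per second, / 1024 / 1024 bytes per MB)
 [avg MB/sec read] = CONVERT(numeric(20, 2),
   s.num_of_bytes_read * 1000.0 / NULLIF(s.sample_ms, 0) / 1024 / 1024),
 [avg MB/sec written] = CONVERT(numeric(20, 2),
   s.num_of_bytes_written * 1000.0 / NULLIF(s.sample_ms, 0) / 1024 / 1024),
 [avg read latency ms] = CONVERT(numeric(20, 2),
   1.0 * s.io_stall_read_ms / COALESCE(NULLIF(s.num_of_reads, 0), 1)),
 [avg write latency ms] = CONVERT(numeric(20, 2),
   1.0 * s.io_stall_write_ms / COALESCE(NULLIF(s.num_of_writes, 0), 1))
FROM sys.master_files AS f
INNER JOIN sys.databases AS d
ON f.database_id = d.database_id
INNER JOIN sys.dm_io_virtual_file_stats(default, default) AS s
ON d.database_id = s.database_id
AND f.[file_id] = s.[file_id];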


If you want to simulate the graph and capture spikes in real time, well, you'll need to build scaffolding that collects snapshots of this (or some other) query and compares the deltas to generate metrics. That is not something trivial to write for you on a Q&A site - people pay big money for high-quality monitoring tools that make this kind of thing easy. :-)
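
To make the idea concrete, here is a bare-bones sketch of the snapshot/delta approach - the temp table name and the 10-second window are arbitrary choices of mine, not anything Activity Monitor itself uses:

-- Take a baseline snapshot of the cumulative file stats
IF OBJECT_ID('tempdb..#io_snap') IS NOT NULL DROP TABLE #io_snap;

SELECT database_id, [file_id], num_of_bytes_read, num_of_bytes_written,
       GETUTCDATE() AS collection_time
INTO #io_snap
FROM sys.dm_io_virtual_file_stats(default, default);

-- Wait out the sample interval
WAITFOR DELAY '00:00:10';

-- Diff the current values against the baseline to get per-file MB/sec for the interval
SELECT DB_NAME(s.database_id) AS [database], s.[file_id],
       CONVERT(numeric(20, 2), (s.num_of_bytes_read - p.num_of_bytes_read)
           / 1048576.0 / DATEDIFF(second, p.collection_time, GETUTCDATE())) AS [MB/sec read],
       CONVERT(numeric(20, 2), (s.num_of_bytes_written - p.num_of_bytes_written)
           / 1048576.0 / DATEDIFF(second, p.collection_time, GETUTCDATE())) AS [MB/sec written]
FROM sys.dm_io_virtual_file_stats(default, default) AS s
INNER JOIN #io_snap AS p
ON p.database_id = s.database_id
AND p.[file_id] = s.[file_id];

Run something like that on a schedule (an Agent job writing to a permanent table, say) and you have the raw material for a graph; the code further down does essentially the same thing, just with more care.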

If you want to get an idea of how Activity Monitor does this kind of thing, you can get at the definition for the procedure #am_generate_waitstats with a little effort. I did the heavy lifting for you and here it is in all its unformatted glory - this is how it calculates various wait statistics:

CREATE PROC #am_generate_waitstats AS 
BEGIN

-- Setup query starts a tran -- make sure it wasn't orphaned
WHILE (@@TRANCOUNT > 0) COMMIT TRAN;

-- Get the most recent snapshot ID in the history table ('previous snapshot')
DECLARE @previous_snapshot_id bigint;
DECLARE @previous_collection_time datetime;
SELECT @previous_snapshot_id = ISNULL (MAX (snapshot_id), 0) FROM #am_wait_stats_snapshots;
SELECT TOP 1 @previous_collection_time = ISNULL (collection_time, GETUTCDATE()) FROM #am_wait_stats_snapshots 
WHERE snapshot_id = @previous_snapshot_id;

-- The snapshot we're about to capture ('current snapshot')
DECLARE @current_snapshot_id bigint;
DECLARE @current_collection_time datetime;
SET @current_collection_time = GETUTCDATE();
SET @current_snapshot_id = @previous_snapshot_id + 1;

-- The snapshots table holds the two most recent snapshots. Delete the older of the two to 
-- make room for a new snapshot. 
DELETE FROM #am_wait_stats_snapshots WHERE snapshot_id < @previous_snapshot_id;
DELETE FROM #am_resource_mon_snap WHERE current_snapshot_id < @previous_snapshot_id;

DECLARE @interval_sec numeric (20, 4);
SET @interval_sec = 
    CASE 
        -- Avoid int overflow that DATEDIFF (ms, ...) can cause in the event of a huge gap between intervals
        WHEN DATEDIFF (second, @previous_collection_time, @current_collection_time) >= 10 
            THEN DATEDIFF (second, @previous_collection_time, @current_collection_time)
        -- Avoid divide-by-zero
        WHEN DATEDIFF (millisecond, @previous_collection_time, @current_collection_time) = 0.0 
            THEN 0.0001
        ELSE DATEDIFF (millisecond, @previous_collection_time, @current_collection_time) / 1000.0
    END;

-- This query captures in-progress and completed (cumulative) wait time for each wait type
INSERT INTO #am_wait_stats_snapshots 
SELECT 
    @previous_snapshot_id + 1 AS snapshot_id, 
    @current_collection_time AS collection_time, 
    wait_type, 
    SUM (waiting_tasks_count) AS waiting_tasks_count, 
    SUM (signal_wait_time_ms) AS signal_wait_time_ms, 
    SUM (wait_time_ms) AS wait_time_ms, 
    SUM (raw_wait_time_ms) AS raw_wait_time_ms
FROM 
(
    -- global server wait stats (completed waits only)
    SELECT 
        wait_type, 
        waiting_tasks_count, 
        (wait_time_ms - signal_wait_time_ms) AS wait_time_ms,  
        signal_wait_time_ms, 
        wait_time_ms AS raw_wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE waiting_tasks_count > 0 OR wait_time_ms > 0 OR signal_wait_time_ms > 0
    UNION ALL 
    -- threads in an in-progress wait (not yet completed waits)
    SELECT 
        wait_type, 
        1 AS waiting_tasks_count, 
        wait_duration_ms AS wait_time_ms, 
        0 AS signal_wait_time_ms, 
        wait_duration_ms AS raw_wait_time_ms
    FROM sys.dm_os_waiting_tasks
    -- Very brief waits will quickly roll into dm_os_wait_stats; we only need to 
    -- query dm_os_waiting_tasks to handle longer-lived waits. 
    WHERE wait_duration_ms > 1000
) AS merged_wait_stats
GROUP BY merged_wait_stats.wait_type;

/*
Now join the current snapshot to the prior snapshot to calculate the wait time for 
the just-completed time interval. 

The previous_snapshot derived table represents cumulative wait stats at the beginning 
of the just-completed time interval, while current_snapshot is wait stats at the end of 
the interval. By subtracting the start cumulative wait time from the end cumulative 
wait time, we can calculate the wait time that accumulated during this time interval. 

The query uses a CTE to expose the interval-specific stats that are the result of the 
comparison of the current snapshot to the previous snapshot. 

Notes on the calculation of the [weighted_average_wait_time_per_sec] column: 
-------------------------
We use a weighted average formula to provide the user with a more stable indicator of 
the recent waittime. This is primarily useful if the user has selected a rapid refresh 
rate.  With a sample interval of 1 second, for example, it could be hard to assess the 
overall bottleneck on the system if short-term fluctuations in waittime distribution 
were creating constant changes in the 'worst' wait category for the just-completed 1 
second sample interval. 

The goal of this is to make the average of (approximately) the past 30-60 seconds 
dominate the value.  The longer the sample interval, the greater the weight that is 
given to the just-completed sample interval. With a 1-second sample interval, the 
just-completed sample is given a weight of about 10%.  With a 5 second sample 
interval, it is given a weight of about 50%.  Sample intervals longer than 5 seconds 
quickly approach 100% (a 60 second interval is weighted as ~80%, and a 10 minute 
interval at ~95%).  

The formula for the weight W of the current interval is below (T is the duration of 
the just-completed time interval, in seconds): 

    W  =  (1 - (T / (T^1.4 + 0.1)))

You can see the curve of the weight function by running this TSQL: 
    DECLARE @interval_sec numeric(20, 2)
    DECLARE @current_interval_weight numeric (20, 2)
    SET @interval_sec = 1
    PRINT CONVERT (char(20), 'Interval (sec)') + 'Interval Weight'
    PRINT CONVERT (char(20), '==============') + '==============='
    WHILE (@interval_sec <= 6000)
    BEGIN
        SET @current_interval_weight = 1 - (@interval_sec / (POWER (@interval_sec, 1.4) + 0.1))
        PRINT CONVERT (char(20), @interval_sec) + CONVERT (char(20), @current_interval_weight) 
            + '  ' + REPLICATE ('-', @current_interval_weight * 100)
        SET @interval_sec = CASE WHEN @interval_sec = 1 THEN 5 ELSE @interval_sec + 5 END
    END

The wait time for the just-completed interval is combined with the previous weighted 
average in the following way, where P is the previous sample's weighted average wait 
time, C is the current interval's wait time, and W is @current_interval_weight (the % 
weight given to the current waittime, using the prior formula): 

    ((1-W) * P) + (W * C)

*/

DECLARE @current_interval_weight numeric (10,6);
-- SET @current_interval_weight = @interval_sec / 60;
SET @current_interval_weight = 1 - (@interval_sec / (POWER (@interval_sec, 1.4) + 0.1))
IF @current_interval_weight > 1 SET @current_interval_weight = 1;

WITH interval_waitstats AS 
(
    -- First get resource wait stats for this interval. 
    SELECT 
        #am_wait_types.category_name, 
        current_snapshot.wait_type, 
        -- All wait stats will be reset to zero by a service cycle, which will cause 
        -- (snapshot2waittime-snapshot1waittime) calculations to produce an incorrect 
        -- negative wait time for the interval.  Detect this and avoid calculating 
        -- negative wait time/wait count/signal time deltas. 
        CASE 
            WHEN (current_snapshot.waiting_tasks_count - previous_snapshot.waiting_tasks_count) < 0 THEN current_snapshot.waiting_tasks_count 
            ELSE (current_snapshot.waiting_tasks_count - previous_snapshot.waiting_tasks_count) 
        END AS interval_waiting_tasks_count, 
        -- Use [raw_wait_time_ms] for this check because [wait_time_ms] is a calculated 
        -- value and tiny differences in sample time for signal wait time vs. wait time 
        -- can cause its calculated value to be slightly negative even when stats weren't 
        -- reset. 
        CASE 
            WHEN (current_snapshot.raw_wait_time_ms - previous_snapshot.raw_wait_time_ms) < 0 THEN current_snapshot.wait_time_ms
            ELSE (current_snapshot.wait_time_ms - previous_snapshot.wait_time_ms)
        END AS interval_resource_wait_time, 
        CASE 
            WHEN (current_snapshot.signal_wait_time_ms - previous_snapshot.signal_wait_time_ms) < 0 THEN current_snapshot.signal_wait_time_ms 
            ELSE (current_snapshot.signal_wait_time_ms - previous_snapshot.signal_wait_time_ms) 
        END AS interval_resource_signal_time, 
        current_snapshot.wait_time_ms AS resource_wait_time_cumulative 
    FROM (
            SELECT * FROM #am_wait_stats_snapshots WHERE snapshot_id = @previous_snapshot_id
        ) AS previous_snapshot
    INNER JOIN (
            SELECT * FROM #am_wait_stats_snapshots WHERE snapshot_id = @current_snapshot_id
        ) AS current_snapshot
        ON previous_snapshot.wait_type = current_snapshot.wait_type 
    INNER JOIN #am_wait_types ON #am_wait_types.wait_type = current_snapshot.wait_type
    WHERE #am_wait_types.ignore != 1 AND #am_wait_types.category_name != 'Idle'
)
INSERT INTO #am_resource_mon_snap
SELECT 
    @previous_snapshot_id AS previous_snapshot_id, 
    @current_snapshot_id AS current_snapshot_id, 
    @previous_collection_time AS previous_collection_time, 
    @current_collection_time AS current_collection_time, 
    @interval_sec AS interval_sec, 
    interval_waitstats.category_name, 
    interval_waitstats.wait_type, 
    interval_waitstats.interval_waiting_tasks_count, 
    -- Tiny differences in sample time for signal wait time vs. wait time can cause our 
    -- calculated wait time to be slightly negative
    CASE 
        WHEN interval_waitstats.interval_resource_wait_time < 0 THEN 0
        ELSE interval_waitstats.interval_resource_wait_time
    END AS interval_resource_wait_time, 
    interval_waitstats.interval_resource_signal_time, 
    CONVERT (bigint, interval_waitstats.interval_resource_wait_time / @interval_sec) AS interval_wait_time_per_sec, 
    CONVERT (numeric (10, 2), interval_waitstats.interval_resource_wait_time / @interval_sec / 1000) AS interval_avg_waiter_count, 
    interval_waitstats.resource_wait_time_cumulative, 
    CASE 
        WHEN previous_interval_waitstats.weighted_average_wait_time_per_sec IS NULL 
        THEN CONVERT (bigint, interval_waitstats.interval_resource_wait_time / @interval_sec)
        ELSE
            -- weighted average formula -- see comment above for explanation
            ((1-@current_interval_weight) * previous_interval_waitstats.weighted_average_wait_time_per_sec) 
                + (@current_interval_weight * CONVERT (bigint, interval_waitstats.interval_resource_wait_time / @interval_sec))
    END AS weighted_average_wait_time_per_sec
FROM interval_waitstats 
LEFT OUTER JOIN (
    SELECT * FROM #am_resource_mon_snap 
    WHERE current_snapshot_id = @previous_snapshot_id
) AS previous_interval_waitstats ON interval_waitstats.wait_type = previous_interval_waitstats.wait_type;

-- We can delete the data for the previous snapshot now
DELETE FROM #am_resource_mon_snap WHERE current_snapshot_id < @current_snapshot_id;
END
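
As a quick sanity check of the weight formula described in the comment block above, this little snippet of mine reproduces the weights the comment quotes for 1-, 5- and 60-second intervals (roughly 0.09, 0.48 and 0.81):

-- W = 1 - (T / (T^1.4 + 0.1)) for a few sample interval lengths
SELECT T AS interval_sec,
       CONVERT(numeric(10, 2), 1 - (T / (POWER(T, 1.4) + 0.1))) AS interval_weight
FROM (VALUES (1.0e0), (5.0e0), (60.0e0)) AS v(T);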

For file I/O specifically, it is not so cut and dried. With a trace you can uncover code snippets like these, but I have yet to find what ties them together, or even where the #temp tables are defined - I suspect some of the math and other work is done in the application and/or in other queries I'm not capturing in my trace. Needless to say, this is not just a simple SQL query.
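
For reference, here is a sketch of one way to repeat the capture yourself with Extended Events instead of a Profiler trace - the session name and the application-name filter are my own choices, so adjust to taste:

-- Capture the batches SSMS / Activity Monitor submits while it's open and refreshing
CREATE EVENT SESSION am_capture ON SERVER
ADD EVENT sqlserver.sql_batch_completed
(
    ACTION (sqlserver.client_app_name, sqlserver.sql_text)
    WHERE sqlserver.client_app_name LIKE N'%Management Studio%'
)
ADD TARGET package0.ring_buffer;

ALTER EVENT SESSION am_capture ON SERVER STATE = START;
-- ...open Activity Monitor, let it refresh a few times, read the captured batches
-- ("Watch Live Data" in SSMS, or query the ring_buffer target), then clean up:
ALTER EVENT SESSION am_capture ON SERVER STATE = STOP;
DROP EVENT SESSION am_capture ON SERVER;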

Batch #1:

SET NOCOUNT ON;

DECLARE @previous_collection_time datetime;
DECLARE @previous_total_io_mb numeric (28, 1);
DECLARE @current_collection_time datetime;
DECLARE @current_total_io_mb numeric (28, 1);
DECLARE @mb_per_sec numeric (20, 1);

-- Get the previous snapshot's total I/O
SELECT TOP 1 @previous_collection_time = collection_time, @previous_total_io_mb = total_io_bytes 
FROM #am_dbfileio
ORDER BY collection_time DESC;

-- Get the current total I/O.  
SET @current_collection_time = GETDATE();
SELECT @current_total_io_mb = SUM(num_of_bytes_read + num_of_bytes_written) / 1024.0 / 1024.0
FROM sys.dm_io_virtual_file_stats(default, default);

-- Calc the total I/O rate (MB/sec) for the just-completed time interval. 
-- Round values larger than 2MB/sec to the nearest MB.
SET @mb_per_sec = (@current_total_io_mb - @previous_total_io_mb) / DATEDIFF (millisecond, @previous_collection_time, @current_collection_time) * 1000;
IF @mb_per_sec > 2 
BEGIN
  SET @mb_per_sec = ROUND (@mb_per_sec, 0);
END; 

-- Save off current total I/O
INSERT INTO #am_dbfileio (collection_time, total_io_bytes) 
VALUES (@current_collection_time, @current_total_io_mb);

-- Return the I/O rate for the just-completed time interval. 
SELECT ISNULL (@mb_per_sec, 0) AS mb_per_sec;

-- Get rid of all but the most recent snapshot's data
DELETE FROM #am_dbfileio WHERE collection_time < @current_collection_time;
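
The trace never showed me where #am_dbfileio gets created, but judging from the columns the batch uses, a definition along these lines should be enough to let it run on its own (my guess, not Activity Monitor's actual definition):

CREATE TABLE #am_dbfileio
(
    collection_time datetime       NOT NULL,
    total_io_bytes  numeric(28, 1) NOT NULL  -- despite the name, the batch stores MB here
);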

Batch #2:

DECLARE @current_collection_time datetime;
SET @current_collection_time = GETDATE();

-- Grab a snapshot
INSERT INTO #am_dbfilestats
SELECT 
    @current_collection_time AS collection_time, 
    d.name AS [Database], 
    f.physical_name AS [File], 
    (fs.num_of_bytes_read / 1024.0 / 1024.0) [Total MB Read], 
    (fs.num_of_bytes_written / 1024.0 / 1024.0) AS [Total MB Written], 
    (fs.num_of_reads + fs.num_of_writes) AS [Total I/O Count], 
    fs.io_stall AS [Total I/O Wait Time (ms)], 
    fs.size_on_disk_bytes / 1024 / 1024 AS [Size (MB)]
FROM sys.dm_io_virtual_file_stats(default, default) AS fs
INNER JOIN sys.master_files f ON fs.database_id = f.database_id AND fs.file_id = f.file_id
INNER JOIN sys.databases d ON d.database_id = fs.database_id; 

-- Get the timestamp of the previous collection time
DECLARE @previous_collection_time datetime;
SELECT TOP 1 @previous_collection_time = collection_time 
FROM #am_dbfilestats 
WHERE collection_time < @current_collection_time
ORDER BY collection_time DESC;

DECLARE @interval_ms int;
SET @interval_ms = DATEDIFF (millisecond, @previous_collection_time, @current_collection_time); 

-- Return the diff of this snapshot and last
SELECT 
    cur.[Database], 
    cur.[File] AS [File Name], 
    CONVERT (numeric(28,1), (cur.[Total MB Read] - prev.[Total MB Read]) * 1000 / @interval_ms) AS [MB/sec Read], 
    CONVERT (numeric(28,1), (cur.[Total MB Written] - prev.[Total MB Written]) * 1000 / @interval_ms) AS [MB/sec Written], 
    -- protect from div-by-zero
    CASE 
        WHEN (cur.[Total I/O Count] - prev.[Total I/O Count]) = 0 THEN 0
        ELSE
            (cur.[Total I/O Wait Time (ms)] - prev.[Total I/O Wait Time (ms)]) 
                / (cur.[Total I/O Count] - prev.[Total I/O Count])
    END AS [Response Time (ms)]
FROM #am_dbfilestats AS cur
INNER JOIN #am_dbfilestats AS prev ON prev.[Database] = cur.[Database] AND prev.[File] = cur.[File]
WHERE cur.collection_time = @current_collection_time 
    AND prev.collection_time = @previous_collection_time;

-- Delete the older snapshot
DELETE FROM #am_dbfilestats
WHERE collection_time != @current_collection_time;
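
Likewise, #am_dbfilestats is never defined in anything I captured; something like this (again, my guess based on the columns being inserted) lets Batch #2 run standalone - note the first execution returns an empty result set because there is no earlier snapshot to diff against:

CREATE TABLE #am_dbfilestats
(
    collection_time            datetime       NOT NULL,
    [Database]                 sysname        NOT NULL,
    [File]                     nvarchar(260)  NOT NULL,
    [Total MB Read]            numeric(28, 1) NOT NULL,
    [Total MB Written]         numeric(28, 1) NOT NULL,
    [Total I/O Count]          bigint         NOT NULL,
    [Total I/O Wait Time (ms)] bigint         NOT NULL,
    [Size (MB)]                bigint         NOT NULL
);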

So, if you'd like to work from there, that's probably a good starting point. Personally, since I know there are already plenty of tools that do what I'm trying to do, my time is better spent elsewhere, and I prefer not to reinvent the wheel. YMMV.