Sql-server – SQL Server DMV to show if database is used

dmvsql server

What is the best indicator in finding out whether a database is utilized or accessed?

I tried index statistics and based on I/O utilization but none seem to be a good indicator.

Problem: I have a lot of databases on a development environment and due to space issues we are trying to decommission those that are not used without actually messing with the ones that are really needed.

Best Answer

SQL Server doesn't really track database access in the way you want, at least going backward (you can set up things like server-side trace, extended events, auditing, etc. going forward).

There is one ballpark thing you can use: DMVs that track index usage and procedure/trigger/query stats. For example:

;WITH d AS
(
  SELECT d = database_id, name FROM sys.databases
  WHERE state = 0 AND database_id BETWEEN 5 AND 32766
),
index_usage(d,lsk,lsc,llk,lupd) AS
(
  SELECT database_id, MAX(last_user_seek), MAX(last_user_scan),
    MAX(last_user_lookup), MAX(last_user_update)
  FROM sys.dm_db_index_usage_stats
  WHERE database_id BETWEEN 5 AND 32766
  GROUP BY database_id
),
proc_stats(d,lproc) AS
(
  SELECT database_id, MAX(last_execution_time) 
    FROM sys.dm_exec_procedure_stats
    WHERE database_id BETWEEN 5 AND 32766
    GROUP BY database_id
),
trig_stats(d,ltrig) AS
(
  SELECT database_id, MAX(last_execution_time)
    FROM sys.dm_exec_trigger_stats
    WHERE database_id BETWEEN 5 AND 32766
    GROUP BY database_id
),
query_stats(d,lquery) AS
(
  SELECT t.[dbid], MAX(s.last_execution_time) 
    FROM sys.dm_exec_query_stats AS s
    CROSS APPLY sys.dm_exec_sql_text(s.plan_handle) AS t
    WHERE t.[dbid] BETWEEN 5 AND 32766
    GROUP BY t.[dbid]
)
SELECT d.name,i.lsk,i.lsc,i.llk,i.lupd,p.lproc,t.ltrig,q.lquery
FROM d LEFT OUTER JOIN index_usage AS i ON d.d = i.d
LEFT OUTER JOIN proc_stats  AS p ON d.d = p.d
LEFT OUTER JOIN trig_stats  AS t ON d.d = t.d
LEFT OUTER JOIN query_stats AS q ON d.d = q.d;

Note that these statistics aren't completely reliable, since you may not have any stored procedures, and the queries found in sys.dm_exec_query_stats may reference more than one database, and may not ever reflect the one you are concerned about.

Also, they are reset when SQL Server is restarted, or a database is detached / attached or restored, or when a database is auto-closed, and can also depend in some cases on the plans still being in the cache (which another database could completely take over within minutes). So if you are looking into the past, unless you know none of these things have happened for an entire business cycle, I wouldn't rely on just these numbers to determine whether a database is used (there could also be automated processes that are making a database look current, even if you don't care that those automated processes will fail when you remove the database).

Another note is that certain index access may not be tracked in the index usage views; for example, in SQL Server 2014, which adds memory-optimized tables, activity against those hash indexes are not captured this way (and the views where you think the activity would be captured, like sys.dm_db_xtp_hash_index_stats, do not include any date/time columns). If you are using SQL Server 2014 and in-memory OLTP ("Hekaton"), you may want to add some research to cover those objects (in case they are the only ones being referenced in a database).

And one more note is that the queries captured by sys.dm_exec_query_stats could be false positive. For example, if your database has filestream/filetable, you will see these queries being run by the system occasionally:

select table_id, item_guid, oplsn_fseqno, oplsn_bOffset, oplsn_slotid
from [database].[sys].[filetable_updates_<some_id>] with (readpast) order by table_id

So, you may want to add additional filtering to the above query to filter those out (as long as the filter doesn't accidentally filter out queries you do care about). This is probably a safe addition to that derived table:

AND t.[text] NOT LIKE N'%oplsn_fseqno%'

In the end, the safest thing to do in a dev environment is to take the databases you're not sure about offline for a week. If nobody complains, back them up, and drop them. If it takes more than a week for someone to notice they're missing, you can always restore them (there or elsewhere).

I've blogged about this a bit as well:

https://sqlblog.org/2008/05/06/when-was-my-database-table-last-accessed

Related Solutions

Sql-server – Track deleted rows/records

It sounds from your comment on Marlon's answer like these are tables whose structure changes frequently. That could make the trigger solution unwieldy.

In that case, investigate the alternative of doing a Transactional Replication publication and excluding DELETE transactions from the publication. INSERT, UPDATE, and DDL changes would be replicated to your destination table. For SQL 2005, this is the optimal solution.

If you were to upgrade to SQL 2008, you could make use of Change Data Capture. That would be the most robust solution, and easier to maintain than the Replication method. But I know getting the approval/budget/time for such upgrades is not always easy. :)

Sql-server – SQL Server 2008 R2 DMV Question

use dbx;
select foo
from db1.dbo.table
join db2.dbo.table on condition
where some_function();

This query consumed lots of CPU and requested a large memory grant. In which database? The information you want simply doesn't exist as a concept. As a human with insight knowledge and with hindsight benefit, you probably would be able to explain why 75% of CPU is due to db1 and 15% is due to db2. But ultimately you just can assign queries to a database. The fact that some (ok, most) queries are 100% contained inside a db does not mean that all query resources can be assigned deterministically to a db.

However, for practical means is relatively simple to automate exactly what you did in your post: inspect the plans and identify all physical access operators locations and use this info to assign the query resources to a DB.

with xmlnamespaces (default 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
select x.value(N'@NodeId',N'int') as NodeId
    , x.value(N'@PhysicalOp', N'sysname') as PhysicalOp
    , x.value(N'@LogicalOp', N'sysname') as LogicalOp
    , ox.value(N'@Database',N'sysname') as [Database]
    , ox.value(N'@Schema',N'sysname') as [Schema]
    , ox.value(N'@Table',N'sysname') as [Table]
    , ox.value(N'@Index',N'sysname') as [Index]
    , ox.value(N'@IndexKind',N'sysname') as [IndexKind]
    , x.value(N'@EstimateRows', N'float') as EstimateRows
    , x.value(N'@EstimateIO', N'float') as EstimateIO
    , x.value(N'@EstimateCPU', N'float') as EstimateCPU
    , x.value(N'@AvgRowSize', N'float') as AvgRowSize
    , x.value(N'@TableCardinality', N'float') as TableCardinality
    , x.value(N'@EstimatedTotalSubtreeCost', N'float') as EstimatedTotalSubtreeCost
    , x.value(N'@Parallel', N'tinyint') as DOP
    , x.value(N'@EstimateRebinds', N'float') as EstimateRebinds
    , x.value(N'@EstimateRewinds', N'float') as EstimateRewinds
    , st.*
    , pl.query_plan
from sys.dm_exec_query_stats as st
cross apply sys.dm_exec_query_plan (st.plan_handle) as pl
cross apply pl.query_plan.nodes('//RelOp[./*/Object/@Database]') as op(x)
cross apply op.x.nodes('./*/Object') as ob(ox)

Best Answer

Related Solutions

Sql-server – Track deleted rows/records

Sql-server – SQL Server 2008 R2 DMV Question

Related Question