Sql-server – SQL Server DMV to show if database is used

dmvsql server

What is the best indicator in finding out whether a database is utilized or accessed?

I tried index statistics and based on I/O utilization but none seem to be a good indicator.

Problem: I have a lot of databases on a development environment and due to space issues we are trying to decommission those that are not used without actually messing with the ones that are really needed.

Best Answer

SQL Server doesn't really track database access in the way you want, at least going backward (you can set up things like server-side trace, extended events, auditing, etc. going forward).

There is one ballpark thing you can use: DMVs that track index usage and procedure/trigger/query stats. For example:

;WITH d AS
(
  SELECT d = database_id, name FROM sys.databases
  WHERE state = 0 AND database_id BETWEEN 5 AND 32766
),
index_usage(d,lsk,lsc,llk,lupd) AS
(
  SELECT database_id, MAX(last_user_seek), MAX(last_user_scan),
    MAX(last_user_lookup), MAX(last_user_update)
  FROM sys.dm_db_index_usage_stats
  WHERE database_id BETWEEN 5 AND 32766
  GROUP BY database_id
),
proc_stats(d,lproc) AS
(
  SELECT database_id, MAX(last_execution_time) 
    FROM sys.dm_exec_procedure_stats
    WHERE database_id BETWEEN 5 AND 32766
    GROUP BY database_id
),
trig_stats(d,ltrig) AS
(
  SELECT database_id, MAX(last_execution_time)
    FROM sys.dm_exec_trigger_stats
    WHERE database_id BETWEEN 5 AND 32766
    GROUP BY database_id
),
query_stats(d,lquery) AS
(
  SELECT t.[dbid], MAX(s.last_execution_time) 
    FROM sys.dm_exec_query_stats AS s
    CROSS APPLY sys.dm_exec_sql_text(s.plan_handle) AS t
    WHERE t.[dbid] BETWEEN 5 AND 32766
    GROUP BY t.[dbid]
)
SELECT d.name,i.lsk,i.lsc,i.llk,i.lupd,p.lproc,t.ltrig,q.lquery
FROM d LEFT OUTER JOIN index_usage AS i ON d.d = i.d
LEFT OUTER JOIN proc_stats  AS p ON d.d = p.d
LEFT OUTER JOIN trig_stats  AS t ON d.d = t.d
LEFT OUTER JOIN query_stats AS q ON d.d = q.d;

Note that these statistics aren't completely reliable, since you may not have any stored procedures, and the queries found in sys.dm_exec_query_stats may reference more than one database, and may not ever reflect the one you are concerned about.

Also, they are reset when SQL Server is restarted, or a database is detached / attached or restored, or when a database is auto-closed, and can also depend in some cases on the plans still being in the cache (which another database could completely take over within minutes). So if you are looking into the past, unless you know none of these things have happened for an entire business cycle, I wouldn't rely on just these numbers to determine whether a database is used (there could also be automated processes that are making a database look current, even if you don't care that those automated processes will fail when you remove the database).

Another note is that certain index access may not be tracked in the index usage views; for example, in SQL Server 2014, which adds memory-optimized tables, activity against those hash indexes are not captured this way (and the views where you think the activity would be captured, like sys.dm_db_xtp_hash_index_stats, do not include any date/time columns). If you are using SQL Server 2014 and in-memory OLTP ("Hekaton"), you may want to add some research to cover those objects (in case they are the only ones being referenced in a database).

And one more note is that the queries captured by sys.dm_exec_query_stats could be false positive. For example, if your database has filestream/filetable, you will see these queries being run by the system occasionally:

select table_id, item_guid, oplsn_fseqno, oplsn_bOffset, oplsn_slotid
from [database].[sys].[filetable_updates_<some_id>] with (readpast) order by table_id

So, you may want to add additional filtering to the above query to filter those out (as long as the filter doesn't accidentally filter out queries you do care about). This is probably a safe addition to that derived table:

AND t.[text] NOT LIKE N'%oplsn_fseqno%'

In the end, the safest thing to do in a dev environment is to take the databases you're not sure about offline for a week. If nobody complains, back them up, and drop them. If it takes more than a week for someone to notice they're missing, you can always restore them (there or elsewhere).

I've blogged about this a bit as well: