As long as you are sure the server was up for that entire time, and that nobody cleared out DMV stats inadvertently. This can happen if the database is detached + re-attached / restored / offline + online / auto-close + online, or if the index has been explicitly dropped / re-created (the DMV is not affected by disable / rebuild / reorganize, except in the case of SQL Server 2012, where rebuild currently clears the stats - I suspect that will be fixed - thanks @MartinSmith).
You should expect that if a DML activity used the index in a read capacity for some reason (e.g. to perform an update on some other table), this would register as a read activity, not a write. All of the writes you see are index maintenance.
As an aside, here is a slightly more efficient way to write this query. I removed the twice-referenced correlated subquery and dropped the unnecessary joins to sys.objects
and sys.schemas
. I also fixed a few minor syntax things, such as prefixing all columns with aliases where appropriate, placing reserved words in square brackets, and removing AS 'column alias'
syntax. I found it confusing that half way through the SELECT
list you changed from alias = expression
syntax to expression AS alias
syntax - you should pick one and be consistent IMHO. I changed these both for majority within the query and also according to my personal preference. :-)
SELECT name = OBJECT_NAME(s.[object_id])
, indexname = i.name
, i.index_id
, reads = s.user_seeks + s.user_scans + s.user_lookups
, writes = s.user_updates
, g.[rows]
, reads_per_write = CASE
WHEN s.user_updates < 1 THEN 100.0
ELSE 1.00 * (s.user_seeks + s.user_scans + s.user_lookups) / s.user_updates
END
, [drop statement] = 'DROP INDEX ' + QUOTENAME(i.name)
+ ' ON ' + QUOTENAME(OBJECT_SCHEMA_NAME(s.[object_id]))
+ '.' + QUOTENAME(OBJECT_NAME(s.[object_id]))
FROM sys.dm_db_index_usage_stats AS s
INNER JOIN sys.indexes AS i
ON i.index_id = s.index_id
AND s.object_id = i.object_id
INNER JOIN
(
SELECT [object_id], index_id, [rows] = SUM([rows])
FROM sys.partitions GROUP BY [object_id], index_id
HAVING SUM([rows]) > 2000
) AS g
ON i.[object_id] = g.[object_id]
AND i.index_id = g.index_id
WHERE OBJECTPROPERTY(s.[object_id], 'IsUserTable') = 1
AND s.database_id = DB_ID()
AND i.type_desc = 'nonclustered'
AND i.is_primary_key = 0
AND i.is_unique_constraint = 0
ORDER BY name, reads;
OK, I ended up adding another lookup table:
CREATE TABLE IF NOT EXISTS `stops_routes` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`stop_id` varchar(100) NOT NULL,
`route_id` varchar(100) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `stop_route` (`stop_id`,`route_id`),
KEY `stop_id` (`stop_id`),
KEY `route_id` (`route_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Filling it was fairly fast:
mysql> insert into stops_routes (stop_id, route_id)
->
-> select
-> s.stop_id,
-> r.route_id as from_route_id
->
-> from routes r
-> left join trips t on t.route_id = r.route_id
-> left join stop_times st on st.trip_id = t.trip_id
-> left join stops s on s.stop_id = st.stop_id
-> group by s.stop_id, r.route_id;
Query OK, 3496 rows affected (8.38 sec)
Records: 3496 Duplicates: 0 Warnings: 0
Using it is blazingly fast:
mysql> select
-> r.route_id as from_route_id,
-> c_sr.route_id as to_route_id
->
-> from routes r
->
-> left join stops_routes sr on sr.route_id = r.route_id
-> left join stop_connections c_s on c_s.from_stop_id = sr.stop_id
-> left join stops_routes c_sr on c_sr.stop_id = c_s.to_stop_id
->
-> where r.route_id <> c_sr.route_id
-> group by r.route_id, c_sr.route_id
-> limit 10;
+---------------+-------------+
| from_route_id | to_route_id |
+---------------+-------------+
| 0001 | 0002 |
| 0001 | 0003 |
| 0001 | 0004 |
| 0001 | 0005 |
| 0001 | 0006 |
| 0001 | 0008 |
| 0001 | 0009 |
| 0001 | 0011 |
| 0001 | 0014 |
| 0001 | 0031 |
+---------------+-------------+
10 rows in set (0.63 sec)
Now I can fill my last lookup table (set of connections between every routes on my GTFS network):
mysql> insert into route_connections (from_route_id, to_route_id)
-> select
-> r.route_id as from_route_id,
-> c_sr.route_id as to_route_id
->
-> from routes r
->
-> left join stops_routes sr on sr.route_id = r.route_id
-> left join stop_connections c_s on c_s.from_stop_id = sr.stop_id
-> left join stops_routes c_sr on c_sr.stop_id = c_s.to_stop_id
->
-> where r.route_id <> c_sr.route_id
-> group by r.route_id, c_sr.route_id;
Query OK, 2848 rows affected (0.31 sec)
Records: 2848 Duplicates: 0 Warnings: 0
Amazingly fast. I guess the engine couldn't break up the steps to optimize this.
I'd still be interested to know if it would be possible to get the same result (from route to route connections table) using only one sub-second or sub-minute query.
Best Answer
You can't reset the DMVs, however you can work around this limitation and remove rows from the DMVs by creating a small filtered index on the tables mentioned in the DMVs then immediately dropping that index.
For instance:
I've created a script to automate this process.
This can be executed with the
@Database
parameter set to the name of a database to eliminate recommended indexes only related to that database, or without parameters to eliminate all recommended indexes. One can optionally limit this to the suggestions for a single table by passing the name of the table into the@Table
parameter.It creates at most one index per table. The index has a unique name and is constructed using a single column, giving priority to using the table's
PRIMARY KEY
, if it has one. Tables will be missed by this proc if the table does not have a primary key or at least one of the following types of columns:I've written a short blog post on this issue over at SQLServerScience