SQL Server – Inconsistent dm_db_index_physical_stats Results Explained

fragmentationindexsql serversql-server-2008

SQL Server 2008 RTM (10.0.1600.22)

Running a typical index fragmentation report on a table:

SELECT OBJECT_NAME(i.OBJECT_ID) AS TableName,
       i.name AS IndexName,
       s.index_type_desc,
       s.avg_fragmentation_in_percent,
       s.fragment_count, 
       s.page_count,
       s.record_count
FROM
    sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('RPT_XXX'), NULL, NULL, 'DETAILED') s
    INNER JOIN
    sys.indexes i ON i.OBJECT_ID = s.OBJECT_ID
                       AND i.index_id = s.index_id
                       and i.object_id = OBJECT_ID('RPT_XXX')
WHERE
    s.avg_fragmentation_in_percent > 25

tells me that one index on that table has 7 fragments, 7 pages, about 700 rows and over 28% fragmentation. However, when I look at the index storage properties in the Management Studio, I get nearly 9000 pages, over 12 million rows, and 0.01% fragmentation, which are the expected numbers because the table is big and I just did the index rebuild and updated statistics on the table.

What, apart from a possible bug in the unsupported RTM version of SQL Server, might be the reasons for this discrepancy?

proof

Best Answer

Before I gave anyone else a chance to answer this, I found the problem with my query.

dm_db_index_physical_stats(), when running in the 'DETAILED' mode, returns one row for each index level. In my case the leaf level of the index indeed has 0.01% fragmentation and is therefore filtered out by the s.avg_fragmentation_in_percent > 25 criteria, leaving only one intermediate level in the result.

Arguably you would only want to look at the leaf level of the index to identify fragmentation, so the query should look like so:

SELECT OBJECT_NAME(i.OBJECT_ID) AS TableName,
       i.name AS IndexName,
       s.index_type_desc,
       s.avg_fragmentation_in_percent,
       s.fragment_count, 
       s.page_count,
       s.record_count
FROM
    sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('RPT_XXX'), NULL, NULL, 'DETAILED') s
    INNER JOIN
    sys.indexes i ON i.OBJECT_ID = s.OBJECT_ID
                       AND i.index_id = s.index_id
                       AND i.object_id = OBJECT_ID('RPT_XXX')
WHERE
    s.index_level = 0 AND           -- leaf level only
    i.is_hypothetical = N'0' AND    -- exclude hypothetical indexes
    s.page_count > 1000 AND         -- look at large indexes only
    s.avg_fragmentation_in_percent > 25

The criteria i.is_hypothetical = N'0' comes from observing the query generated by the Management Studio in the trace, as suggested by Aaron Bertrand.

Edit: It may also make sense to exclude smaller indexes from this analysis, because their fragmentation a) may not be important and b) may not be significantly improved.