SQL Server Index – Difference Between Average and Total Fragmentation

fragmentationsql server

Using SQL server, If i navigate to a table in the object explorer, go to its indexes, go to its properties, and view the fragmentation tab, the top section has 2 lines for me, the Page fullness in percent, and the Total Fragmentation in percent.

I also have a query i occasional use for index information. Trimming out some unrelated fields, this is the query:

SELECT t.name, ix.name, avg_fragmentation_in_percent
FROM sys.indexes AS ix
INNER JOIN sys.tables t ON t.object_id = ix.object_id
INNER JOIN sys.schemas s ON t.schema_id = s.schema_id
INNER JOIN sys.dm_db_index_physical_stats  (db_id(), NULL, NULL, NULL, 'DETAILED') ps ON t.object_id = ps.object_id AND ix.index_id = ps.index_id
WHERE  ix.index_id > 0
    AND ix.name IS NOT NULL

When i run this query, i get 3 columns, the first 2 pointing to the index itself, and for this query, the third is a fragmentation in percent for that index.

These 2 numbers are Usually they are very close, sometimes identical, but it also can be and frequently is different. I have had cases where the discrepancy is massive, over 50% in some cases, but normally it is something like <1% vs 15-30%. The queries result is almost always the higher percent.

Rebuilding the index will completely or nearly zero both of these numbers (though not necessarily will both be zero). So i now have 2 numbers for fragmentation on an index. According to the documentation for sys.dm_db_index_physical_stats and Index Properties(FragmentationPage) Both are supposed to be the logical Fragmentation of the index, so they should always be the same number. How is it that i am getting different numbers for what should be the same statistic?

Best Answer

There's a big difference between your query and the query the UI sends (which you can spy on using server-side trace, Extended Events, etc). In yours, the last argument to sys.dm_db_index_physical_stats is DETAILED:

... sys.dm_db_index_physical_stats( ..., 'DETAILED') ...

Whereas the UI is on a budget and sends SAMPLED instead:

... sys.dm_db_index_physical_stats( ..., 'SAMPLED') ...

DETAILED scans all of the pages, while SAMPLED only scans 1 percent. That's right, 1%. You pay for more accuracy obviously, but relying on the info about every 100th page to make any judgments about the state of the index is, well, not very useful. You could always change your manual query to SAMPLED if your goal is to just match the uselessness of the UI. :-)

(In fairness, this topic about the UI properties page should explicitly document the method that is used to determine fragmentation. I have a confirmation from Microsoft's Pedro Lopes that he will keep it in mind for an upcoming doc revision.)

Just never use or trust the UI. Never mind that it's a pain in the rear to look at more than one index at a time, it's also cumbersome to figure out what it's even doing, and you always have to worry about whether it is still doing what you used to think it did... it's just not worth the occasional expediency in looking up one thing rather than having a verifiable, repeatable query to run...