Sql-server – reporting space consumption to files in filegroup – how to reverse-engineer the algorithm spreading data across files

disk-spacedmvmonitoringperformanceperformance-tuningsql server

I would like to know how to calculate the distribution of data consumed per file in a filegroup, back to the index (HEAP, CLUSTERED, NONCLUSTERED) storing it. My intention is to define which I/O goes where on disk.

I get to data_space_id level from sys.indexes, showing pages used, allocated; and data_space_id size from sys.filegroups. So I get to where the weighted (by free space ratio?) algorithm for storing data to files within the filegroup takes effect. I can join to sys.database_files using data_space_id.

From sys.dm_allocation_units (joined to indexes by object_Id and index_Id) I get partition_ID; joining with sys.dm_partitions advises row count, pages used and allocated, allowing a calculation to show also what is free per partition. Can't get to partition to file…?

I have a query which I apportion DATA to FILE based on ratio of used pages per file in a filegroup, applying this ratio to the index data stored on the file group the files belong to.

Is there a better way of drilling down table/index data to file level allocation? (Measure instead of compute?)

For indid = 0 or indid = 1, dpages is the count of data pages used.
For indid > 1, dpages is the count of index pages used.

For indid = 0 or indid = 1, used is the count of the total pages used for all index and table data.
For indid > 1, used is the count of pages used for the index.

For indid = 0 or indid = 1, reserved is the count of pages allocated for all indexes and table data.
For indid > 1, reserved is the count of pages allocated for the index.

SQL:

Select  T.Name TableName,
        ISNULL(SI.Name, SI.type_desc) IndexName, 
    SI.index_id, 
    SI.type_desc, 
    SI.data_space_id,
    S.rows Rows,
    S.rowmodctr,
    PIx.avg_fragmentation_in_percent,
    PIx.fragment_count,
    PIx.avg_fragment_size_in_pages,
    CASE SI.Index_ID
    WHEN 0 THEN S.dpages
    WHEN 1 THEN S.dpages
    END DataPages,
    CASE 
    WHEN SI.index_id > 1
        THEN S.dpages
    END IndexPages,
    FileUsageRatio,
    S.dpages*FileUsageRatio RationedDataPagesToFile,
    physical_name,
    CASE SI.Index_ID
    WHEN 0 THEN S.Used 
    WHEN 1 THEN S.Used
    END UsedTableDataPages,
    CASE 
    WHEN SI.index_id > 1
        THEN S.Used
    END UsedIndexPages,
    CASE SI.Index_ID
    WHEN 0 THEN S.reserved 
    WHEN 1 THEN S.reserved
    END ReservedTableDataPages,
    CASE 
    WHEN SI.index_id > 1
        THEN S.Reserved
    END ReservedIndexPages,
    FG.name FileGroupName,
    FG_SpaceUsage.FG_AllocatePages,
    FG_SpaceUsage.FG_UsedPages,
    FG_SpaceUsage.FG_FreePages,
    FG.type_desc FileGroup_Type_desc,
    OIx.singleton_lookup_count,
    OIx.range_scan_count,
    OIx.page_io_latch_wait_count,
    OIx.page_io_latch_wait_in_ms,
    OIx.page_latch_wait_count,
    OIx.page_latch_wait_in_ms,
    OIx.row_lock_count,
    OIx.row_lock_wait_count,
    OIx.row_lock_wait_in_ms,
    OIx.page_lock_count,
    OIx.page_lock_wait_count,
    OIx.page_lock_wait_in_ms
from SYS.tables T
JOIN sys.indexes SI
ON T.Object_ID = SI.Object_ID
JOIN sys.filegroups FG
ON FG.Data_Space_ID = SI.Data_Space_ID
JOIN sys.partitions P
ON P.index_id = SI.index_id
AND P.object_id = SI.object_id
JOIN sys.allocation_units AU
ON AU.allocation_unit_id = P.partition_id
JOIN sys.dm_db_index_operational_stats(db_id(),NULL,NULL,NULL) OIx
ON OIx.index_id = P.index_id
AND OIx.object_id = P.object_id
JOIN sys.dm_db_index_physical_stats(db_id(),NULL,NULL,NULL,NULL) PIx
ON PIx.index_id = P.index_id
AND PIx.object_id = P.object_id
JOIN sys.sysindexes S
ON S.IndID = SI.Index_ID
AND S.ID = SI.Object_ID
JOIN (
SELECT Data_Space_ID, SUM(SIZE) FG_AllocatePages, SUM(UsedPages) FG_UsedPages,   SUM   (SIZE) - SUM(UsedPages) FG_FreePages
FROM SYS.database_files DBF
    CROSS APPLY (   SELECT FILEPROPERTY(DBF.name, 'SpaceUsed') AS UsedPages) PagesUsed
GROUP BY DBF.data_space_id
) FG_SpaceUsage
ON FG_SpaceUsage.data_space_id = FG.data_space_id
JOIN 
(
 SELECT CTE.data_space_id, CTE.file_ID, CTE.physical_name,      CTE.UsedPages*1.0/AGGREGATED.Total_UsedPages FileUsageRatio
 FROM (
 SELECT data_space_Id, File_ID, Physical_Name, Size, PagesUsed.UsedPages, Size-     PagesUsed.UsedPages FreePages
 FROM sys.database_Files DBF
 CROSS APPLY (  SELECT FILEPROPERTY(DBF.name, 'SpaceUsed') AS UsedPages) PagesUsed
 ) CTE
JOIN
    (SELECT Data_Space_ID, SUM(UsedPages) Total_UsedPages
     FROM (
            SELECT data_space_Id, File_ID, Physical_Name, Size, PagesUsed.UsedPages, Size-PagesUsed.UsedPages FreePages
            FROM sys.database_Files DBF
            CROSS APPLY (   SELECT FILEPROPERTY(DBF.name, 'SpaceUsed') AS UsedPages) PagesUsed
            ) CTE
     GROUP BY Data_Space_ID
     ) AGGREGATED
ON CTE.data_space_id = AGGREGATED.data_space_id
) DataUsageRatio
ON DataUsageRatio.data_space_id = FG.data_space_id

ORDER BY TableName, Index_ID

Best Answer

Upon consideration, I can't do what I want to do, better than I am doing it. SQL Server doesn't store what goes where deeper than the filegroup level it seems. From there it's all internals only; and no exposed methods to view.

Background links:

Round Robin vs. Proportional Fill by Rob Nicholson
Understanding the -E Startup Parameter by James Rowland-Jones