SQL Server – Filegroup [primary]: dbcc shrinkfile() – need balanced extent counts per file when done

performance, performance-tuning, sql-server, t-sql

I'm working with a ~1 TB database on a server with 40+ cores and 500+ GB of memory; the db files sit on a SAN (no control over LUN segments), and the bottleneck now is file contention. Currently running 2014, but by the final "release" it will likely be the latest Enterprise edition.

I've separated things out by schema (we don't use dbo), targeting the largest chunks of data usage first. I found candidates with around 200 GB on disk and put each into its own filegroup with 4 files. The final phase is to take [primary] and split it into 4 as well. Below is the proc I made to move a schema into a filegroup; it will not work for [primary] in the desired way.

alter proc dbo.admin_MoveSchema ( 
         @schema varchar(128) 
        ,@exec bit = 0 
        ,@storage varchar(2048) = null 
        ,@filegroup varchar(128) = null 
        ,@files int = 4 
) 
as 
/*  *************************************************************************** 
 *  example:   
    exec dbo.admin_MoveSchema @schema = 'Schema1', @exec = 1 
 *  *************************************************************************** 
 */ 

begin try 
    declare 
         @msg varchar(255) 
        ,@separator varchar(255) = replicate('=',128)
    set nocount on; 
    if nullif(@storage,'') is null 
        begin 
            set @storage = (select top 1 reverse(right(reverse(physical_name),abs(charindex('\',reverse(physical_name)) - len(physical_name)))) + '\' from sys.database_files where physical_name like '%.mdf') 
        end; 
    set @filegroup = rtrim(ltrim(replace(replace(@filegroup,']',''),'[','')));
    if nullif(@filegroup,'') is null 
        begin 
            set @filegroup = 'FG_' + @schema  /* will obviously fail if the schema name is already at max size of 128 */ 
        end; 
    declare 
         @s nvarchar(max) 
        ,@i int = 1 
    set @msg = 'Creating Filegroup ' + @filegroup; 
    print @separator; 
    print '||' + replicate(' ', 39) + @msg; 
    raiserror(@separator,0,0) with nowait; 
    begin try 
        set @s = ' 
if not exists (select 1 from sys.filegroups where name = ''' + @filegroup + ''') and ''' + @filegroup + ''' <> ''primary'' 
    begin 
        alter database ' + quotename(db_name()) + ' 
        add filegroup ' + quotename(@filegroup) + '; 
        print ''' + quotename(@filegroup) + ' added!'' 
    end 
        else 
    begin 
        print ''' + quotename(@filegroup) + ' exists.'' 
    end;' 
        if @exec = 1 
            begin 
                exec(@s);
            end 
                else 
            begin 
                print(@s); 
            end; 
        set @msg = 'Creating Files for ' + @filegroup; 
        print @separator; 
        print '||' + replicate(' ', 39) + @msg; 
        raiserror(@separator,0,0) with nowait; 
        while @i <= @files 
            begin 
                set @s = '
if not exists (select * from sys.sysfiles where name = ''' + @filegroup + '_' + right('00' + rtrim(@i),3) + ''') 
    begin 
        alter database [' + db_name() + '] 
        add file 
        ( 
            name = ' + quotename(@filegroup + '_' + right('00' + rtrim(@i),3)) + ', 
            filename = ''' + @storage + @filegroup + '_' + right('00' + rtrim(@i),3) + '.ndf'', 
            size = 8mb, 
            maxsize = unlimited, 
            filegrowth = 10% 
        ) 
        to filegroup ' + quotename(@filegroup) + '; 
        print ''added file: ' + quotename(@filegroup + '_' + right('00' + rtrim(@i),3)) + ''';
    end
        else
    begin
        print ''' + quotename(@filegroup + '_' + right('00' + rtrim(@i),3)) + ' exists'';
    end; '; 
                if @exec = 1 
                    begin 
                        exec(@s); 
                    end 
                        else 
                    begin 
                        print(@s); 
                    end; 
                set @i = @i + 1; 
            end; 
    end try 
    begin catch 
        print error_message(); 
    end catch; 

    declare 
         @schema_name varchar(128) 
        ,@table_name varchar(128) 
        ,@column_name varchar(max) -- collection of columns with asc/desc
        ,@index_name varchar(128) 
    set @msg = 'Moving tables in the schema ' + quotename(@schema) + ' to filegroup ' + quotename(@filegroup); 
    print @separator; 
    print '||' + replicate(' ',18) + @msg; 
    raiserror(@separator,0,0) with nowait; 
    declare idxGen cursor local fast_forward
    for 
    select 
         s.name schema_name
        ,t.name table_name
        ,i.name index_name 
        ,stuff((select convert(varchar(max),',') + quotename(c.name) + (case when ic.is_descending_key = 1 then ' desc' else ' asc' end)
            from sys.columns c
            inner join sys.index_columns ic on c.column_id = ic.column_id and c.object_id = ic.object_id and ic.index_id = i.index_id
            where c.column_id = ic.column_id and t.object_id = ic.object_id 
            order by ic.key_ordinal for xml path('')),1,1,'') column_name 
    from sys.indexes (nolock) i 
    inner join sys.tables (nolock) t on t.object_id = i.object_id and i.type_desc = 'CLUSTERED' 
    inner join sys.schemas (nolock) s on s.schema_id = t.schema_id and t.is_ms_shipped = 0 
    where s.name = @schema 
    open idxGen; 
    fetch next from idxGen into @schema_name,@table_name,@index_name,@column_name; 
            /* 
                first match wins, covers unique clustered, clustered and "with" defaults on most tables 
                for unique and not. I'm not sure if the last two ever trigger; will check later. 
             */ 
    while @@fetch_status = 0 
        begin 
                select @s = ' 
begin try 
    begin try 
        create unique clustered index ' + quotename(@index_name) + ' on ' + quotename(@schema_name) + '.' + quotename(@table_name) + ' (' + @column_name + ') 
        with (drop_existing=on) 
        on ' + quotename(@filegroup) + '; 
    end try 
    begin catch 
        create clustered index ' + quotename(@index_name) + ' on ' + quotename(@schema_name) + '.' + quotename(@table_name) + ' (' + @column_name + ') 
        with (drop_existing=on) 
        on ' + quotename(@filegroup) + '; 
    end catch; 
    raiserror(''[Moved/On]: ' + quotename(@schema_name) + '.' + quotename(@table_name) + ' to ' + quotename(@filegroup) + ''', 0,0);
end try 
begin catch 
    print error_message(); 
    raiserror(''[### MOVE FAILED ###]: ' + quotename(@schema_name) + '.' + quotename(@table_name) + ''', 0,0);
end catch; '; 
            if @exec = 1 
                begin 
                    exec(@s); 
                end 
                    else 
                begin 
                    print(@s); 
                end; 
            fetch next from idxGen into @schema_name,@table_name,@index_name,@column_name; 
        end; 
    close idxGen; 
    deallocate idxGen; 
    print @separator; 
    print '||' + replicate(' ',15) + 'Moved Tables in schema:' + quotename(@schema) + ' to filegroup:' + quotename(@filegroup); 
    print '||' + replicate(' ',15) + 'If any [### MOVE FAILED ###] were printed above, manual move of that table may be needed.' 
    print @separator; 
    print ' '; 

end try 
begin catch 
    print 'Error! find out why!'; 
    throw
end catch 
GO 

My problem now is to automate and/or easily replicate splitting the [primary] data into its own set of 4 files.

The process of adding files and running dbcc shrinkfile(1, emptyfile) is fine, but I want to stop the shrink automatically when the page counts are at or close to each other for all files.
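
For reference, the manual command I'm referring to is just this (assuming file_id 1 is the primary .mdf and the additional files for [primary] have already been added):

-- moves allocated pages out of file_id 1 into the other files in the same filegroup
dbcc shrinkfile (1, emptyfile);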

I was thinking of creating and starting a SQL Agent job that runs a loop checking the pages for each file and kills the spid of the dbcc shrinkfile session once the other files were >= the mdf's pages, but I would really prefer to keep it all in one script without external dependencies/requirements (like SQL Agent having to be running).

As it is, I have to open a new query and run the following until the counts nearly match, then stop execution of the dbcc command.

if object_id('tempdb..#fileStats','U') is not null
    begin
        drop table #fileStats;
    end;

create table #fileStats (
     FileId int
    ,FileGroup int
    ,TotalExtents bigint
    ,UserExtents bigint
    ,Name nvarchar(128)
    ,FileName nvarchar(2048)
);
insert into #fileStats (FileId, FileGroup, TotalExtents, UserExtents, Name, FileName)
exec('dbcc showfilestats');

select 
     UserExtents - (select sum(userextents) / count(FileId) from #fileStats where FileGroup = 1) extleft
    ,UserExtents
    ,Name
    ,FileName
from #fileStats 
where FileGroup = 1 
;

I can put that in a loop with a waitfor delay and kill/exit at a near match, but I am trying to take the manual part out of the process; it needs to be runnable by unskilled users or automated processes.
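
Something along these lines is what I'm picturing, as an untested sketch that reuses the #fileStats table above. The shrink itself would have to run in a second session, and its spid (@shrinkSpid) is a placeholder I'd still have to supply:

declare
     @shrinkSpid    int = 123       -- spid of the session running dbcc shrinkfile(1, emptyfile)
    ,@mdfExtents    bigint
    ,@minNdfExtents bigint
    ,@kill          nvarchar(20);

while 1 = 1
    begin
        truncate table #fileStats;
        insert into #fileStats (FileId, FileGroup, TotalExtents, UserExtents, Name, FileName)
        exec('dbcc showfilestats');

        /* extents still in the mdf vs. the smallest of the new files in filegroup 1 */
        select @mdfExtents = UserExtents from #fileStats where FileGroup = 1 and FileId = 1;
        select @minNdfExtents = min(UserExtents) from #fileStats where FileGroup = 1 and FileId <> 1;

        if @minNdfExtents >= @mdfExtents
            begin
                set @kill = N'kill ' + cast(@shrinkSpid as nvarchar(10));
                exec(@kill);        -- kill only accepts a literal spid, hence dynamic sql
                break;
            end;

        waitfor delay '00:00:30';   -- re-check every 30 seconds
    end;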

Any ideas? (I already asked on SO, to no avail.)

Best Answer

If you use a file for temporary storage you can simplify this.

  1. Add 1 additional file to primary
  2. Run EMPTYFILE on the primary data file to move all the data to the temporary file (note: this will error at the end because the primary file can never be fully emptied, but you can ignore it and continue)
  3. Add 3 data files to PRIMARY
  4. Run EMPTYFILE on the temporary file.

During step 4, SQL Server moves the data into the four files using the proportional-fill algorithm, resulting in four near-identically sized files.
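
Roughly, the script looks like this (database name, logical file names, paths, and sizes are placeholders):

-- 1. add a temporary file to PRIMARY
alter database [YourDb] add file
    (name = N'PRIMARY_tmp', filename = N'E:\Data\PRIMARY_tmp.ndf', size = 8mb, filegrowth = 10%)
    to filegroup [primary];

-- 2. push the data out of the original .mdf (file_id 1); this errors once only
--    system pages are left, which can be ignored
dbcc shrinkfile (1, emptyfile);

-- 3. add the three permanent files
alter database [YourDb] add file
    (name = N'PRIMARY_002', filename = N'E:\Data\PRIMARY_002.ndf', size = 8mb, filegrowth = 10%)
    to filegroup [primary];
-- ...repeat for PRIMARY_003 and PRIMARY_004...

-- 4. empty the temporary file; proportional fill spreads the pages across the .mdf
--    and the three new files, after which the temporary file can be removed
dbcc shrinkfile (N'PRIMARY_tmp', emptyfile);
alter database [YourDb] remove file [PRIMARY_tmp];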