Sql-server – Index Maintenance for DW- DB ( ETL process )

data-warehouseindex-maintenanceindex-tuningola-hallengrensql server

I'm planning to set up a t-sql job to perform index maintenance in a DW DB (Stores data through an ETL process) currently overnight and eventually will need to be run more frequently as the business grows.

The question is, which considerations should be taken in order to have an efficient Index Maintenance for an DW DB that usually loads data from an ETL process.

Currently I have in mind to implement the MP used by Ola Hallengren

T-SQL Code Below:

EXECUTE dbo.IndexOptimize
@Databases = 'StackOverflow2010',
@FragmentationLow = NULL,
@FragmentationMedium  = 'INDEX_REORGANIZE,INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
@FragmentationHigh = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
@FragmentationLevel1 = 30,
@FragmentationLevel2 = 70,
@UpdateStatistics = 'ALL'

If you have any additional comments or suggestions will be very appreciated

Best Answer

If you are worried about index fragmentation (which really only matters if you are doing range scans off disk) but you are loading your data periodically during an ETL process I'm assuming you have your indexes enabled (and updated) during the ETL process.

When you load your data your indexes have to be updated all the time while inserting data but end up being fragmented which is why you want to rebuild them again.

Maintaining the index when loading data only to rebuild them later is useless work for SQL Server.

One approach I would suggest considering is the following

  • Disable all your non-clustered indexes on a table before you load data into it
  • Load your data in the order of the clustered key if you have one (most datawarehouses use a surrogate key anyway)
  • Enable your indexes again after you loaded your data. That way the index is only built once after the data load and will be contiguous.

The code to disable and enable/rebuild all indexes for a table would be the following

ALTER INDEX ALL ON dbo.[Table Name] DISABLE

-- perform data load

ALTER INDEX ALL ON dbo.[Table Name] REBUILD