We are moving data from Staging Database into Kimball Data Warehouse DB.
I am hearing different ideas regarding when to disable and rebuild indexes.
Stage tables are truncated every day for each new load. Lets assume we have a customer transaction table, 20 million rows, 5GB. Initial load is 20 mil, and incremental is 100,000 rows everyday (less than 5%)
I am reading its best to disable indexes on new batch fresh loads.
(a) If stage tables are truncated everyday, for Stage: Should I disable index and reenable everyday?
Alter index all on dbo.[StageTable] disable
Insert Into StageTable values (CustomerId, SalesDate, SaleAmount)...
Alter index all on dbo.[StageTable] rebuild
In the actual data warehouse,
(b) Should I reenable and rebuild index for initial load. Should I do this for incremental loads also which are minor?
Alter index all on dbo.[DWTable] disable
Insert Into StageTable values (CustomerId, SalesDate, SaleAmount)...
Alter index all on dbo.[DWTable] rebuild
Best Answer
The answer depends on a number of factors:
So determine what your goal is, and then test each method. For a 5 GB load, it should be a quick test.
Having said all that, for a 5 GB load I wouldn't bother trying to optimize it. SQL Server can write that in just a few seconds. Pick a strategy and run it--if it takes two minutes, then you can only make it about 2 minutes faster--is it worth your time? If you're approaching this as a learning exercise, do it with a significant amount of data--something that will take 30 minutes.