Sql-server – To maximize DB performance, which commands should be run after loading large amounts of data into SQL Server 2008 via SSIS

performancesql serversql-server-2008-r2ssis

I have written an SSIS package to load test data into an empty database. Some of the tables are very large (~700 million rows). Once the SSIS package has completed are there any commands I should run (as an apprentice DBA!) to maximize the performance of the database?

For instance, I executed EXEC sp_updatestats but it reported that no indexes required updating.

Is there a list of things to do once large amounts of data has been loaded or does SQL Server 2008 just take care of all that for you?

Best Answer

If you're loading to an empty database, you could/should take steps to avoid requiring any additional maintenance steps post load. Fragmentation is the enemy, that's what you're trying to avoid.

  • Drop all NC indexes before loading.
  • Post load, add the NC indexes for each table in sequence i.e. don't add an index to TableA, then TableB, then back to TableA. This is has no effect on fragmentation but it can improve the time taken to add the indexes on very large datasets (reduces buffer pool churn).
  • If you can insert data to a table in clustered index order, leave the clustered index in place. If you can't, dump the data into a heap and rebuild in to a clustered index post load.

Rather than copy and paste, I'll point you toward the very comprehensive list of references for ETL optimisation @Marian put together in an answer to a question I asked on BCP. Many of these will be equally applicable to your scenario.