SQL Server – Copying large amounts of data to new target tables

etl, migration, performance, performance-tuning, sql-server, sql-server-2012

I'm migrating a large amount of data (40+ million rows) to a new version of the tables. My question is: what is the best way to improve performance in terms of logging, execution time, and so on?

I have considered the checklist below:

  1. Set the recovery model to bulk-logged. After bulk importing the data, switch the recovery model back to full.
  2. Load data with the TABLOCK hint to reduce locking during execution.
  3. Set MAXDOP to 30, then change it back to normal.

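Step 1 of the checklist can be sketched as follows. `[YourDatabase]` is a placeholder name; note that after switching back to full recovery, a log backup is needed to restart the point-in-time log chain.

```sql
-- Placeholder database name: replace [YourDatabase] with the real one.
ALTER DATABASE [YourDatabase] SET RECOVERY BULK_LOGGED;

-- ... run the bulk load here ...

ALTER DATABASE [YourDatabase] SET RECOVERY FULL;

-- Take a log backup afterwards so point-in-time recovery is possible again.
-- (The backup path below is a placeholder.)
BACKUP LOG [YourDatabase] TO DISK = N'X:\Backups\YourDatabase.trn';
```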
Example of the statements:

/*Insert Example*/

INSERT INTO [dbo].[Order] (
      [CREATED]
      ,[LAST_MODIFIED]
      ,[number]
      ,[Owner_FK]
      ,[Owner2_FK]
      ,[Owner3_FK]
)
SELECT
      [CREATED]
      ,[LAST_MODIFIED]
      ,[number]
      ,[Owner_FK]
      ,[Owner2_FK]
      ,[Owner3_FK]
FROM [Order_old]
WHERE [Owner_FK] IS NOT NULL
   OR [Owner2_FK] IS NOT NULL
   OR [Owner3_FK] IS NOT NULL


/*Delete Example*/

DELETE Ord FROM [Order] Ord 
WHERE [Owner_FK] IS NOT NULL 
   OR [Owner2_FK] IS NOT NULL 
   OR [Owner3_FK] IS NOT NULL 


/*Update Example*/

UPDATE Order_Type
SET [TYPE]=NULL
WHERE [TYPE]=''

I thought about Integration Services, but not being able to use it is a limitation I have. I also had a look at the BCP tool, but it seems there is no way to transfer from one table to another unless we export to a file and then import from that file.

Best Answer

I recommend reading through the Data Loading Performance Guide, fully understanding your maintenance window and what kind of locks you can take during it, and testing on a different database with table sizes as close as possible to production. The question you asked is pretty generic, but one big thing you missed is that you should consider disabling nonclustered indexes before loading a large amount of data. It can be more efficient to rebuild the indexes after the data load is complete.
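As a sketch of that approach (the index name `IX_Order_Owner_FK` is a hypothetical example; substitute the real nonclustered indexes on your target table):

```sql
-- Disable each nonclustered index on the target table before the load.
-- Do NOT disable the clustered index: that makes the table inaccessible.
ALTER INDEX [IX_Order_Owner_FK] ON [dbo].[Order] DISABLE;

-- ... perform the bulk insert ...

-- Rebuilding re-enables the index in one pass over the loaded data.
ALTER INDEX [IX_Order_Owner_FK] ON [dbo].[Order] REBUILD;
```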

Set Recovery Model to bulk-logged recovery model. After bulk importing the data, switch the recovery model back to the full recovery model.

Make sure that you understand the consequences of the switch before doing it. The bulk-logged recovery model can help if it makes some of your queries eligible for minimal logging. Under the right conditions the insert query can be eligible for minimal logging (the target table is empty and you use the TABLOCK hint). Read the Data Loading Performance Guide for full details. The update and delete queries will not benefit from the recovery model switch unless you disable and rebuild indexes as part of that process.
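Assuming those conditions hold (bulk-logged or simple recovery and an empty target table), the insert from the question could carry the hint like this:

```sql
-- TABLOCK on the *target* table is one requirement for minimal logging.
INSERT INTO [dbo].[Order] WITH (TABLOCK) (
       [CREATED], [LAST_MODIFIED], [number],
       [Owner_FK], [Owner2_FK], [Owner3_FK])
SELECT [CREATED], [LAST_MODIFIED], [number],
       [Owner_FK], [Owner2_FK], [Owner3_FK]
FROM [Order_old]
WHERE [Owner_FK]  IS NOT NULL
   OR [Owner2_FK] IS NOT NULL
   OR [Owner3_FK] IS NOT NULL;
```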

Load data with the TABLOCK table hint to reduce locking during execution.

This is one of the requirements for minimal logging within SQL Server but otherwise isn't likely to really help you.

Set MAXDOP to 30 and change it back to normal.

What makes you want to use a MAXDOP of 30? The typical advice is to limit your queries to MAXDOP 8 and to consider going over 8 only for specific operations that parallelize well, such as index rebuilds. None of the queries listed in your question is likely to parallelize well. On SQL Server 2012 you aren't eligible for parallel insert. The part of the query that finds the data to be inserted is eligible for parallelism, but do you really want 30 parallel workers for that part? The delete and update parts of the queries are also single-threaded.
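If you do want a different degree of parallelism for these statements, a query-level hint is safer than changing the server-wide setting and switching it back; for example:

```sql
-- The hint caps just this statement; the server-wide setting is untouched.
-- Only the SELECT (scan) side can use the parallel workers here.
INSERT INTO [dbo].[Order] ([CREATED], [LAST_MODIFIED], [number])
SELECT [CREATED], [LAST_MODIFIED], [number]
FROM [Order_old]
OPTION (MAXDOP 8);
```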