SQL Server – Copying large amounts of data to new target tables

etl, migration, performance, performance-tuning, sql-server, sql-server-2012

I'm migrating a large amount of data (40+ million rows) to a new version of the tables. My question is: what is the best way to improve performance in terms of logging, execution time, and so on?

I have considered the checklist below:

  1. Set the recovery model to bulk-logged. After bulk importing the data, switch the recovery model back to full.
  2. Load data with the TABLOCK hint to reduce locking during execution.
  3. Set MAXDOP to 30, then change it back to normal.

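Step 1 of the checklist can be sketched as follows. `[YourDatabase]` is a placeholder name; note that after switching back to full recovery, a log backup is needed to restart the point-in-time log chain.

```sql
-- Placeholder database name: replace [YourDatabase] with the real one.
ALTER DATABASE [YourDatabase] SET RECOVERY BULK_LOGGED;

-- ... run the bulk load here ...

ALTER DATABASE [YourDatabase] SET RECOVERY FULL;

-- Take a log backup afterwards so point-in-time recovery is possible again.
-- (The backup path below is a placeholder.)
BACKUP LOG [YourDatabase] TO DISK = N'X:\Backups\YourDatabase.trn';
```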
Example of the statements:

/*Insert Example*/

INSERT INTO [dbo].[Order] (
      [CREATED]
      ,[LAST_MODIFIED]
      ,[number]
      ,[Owner_FK]
      ,[Owner2_FK]
      ,[Owner3_FK]
)
SELECT
      [CREATED]
      ,[LAST_MODIFIED]
      ,[number]
      ,[Owner_FK]
      ,[Owner2_FK]
      ,[Owner3_FK]
FROM [Order_old]
WHERE [Owner_FK] IS NOT NULL
   OR [Owner2_FK] IS NOT NULL
   OR [Owner3_FK] IS NOT NULL


/*Delete Example*/

DELETE Ord FROM [Order] Ord 
WHERE [Owner_FK] IS NOT NULL 
   OR [Owner2_FK] IS NOT NULL 
   OR [Owner3_FK] IS NOT NULL 


/*Update Example*/

UPDATE Order_Type
SET [TYPE]=NULL
WHERE [TYPE]=''

I thought about Integration Services, but not being able to use it is a limitation I have. I also had a look at the BCP tool, but it seems there is no way to transfer from one table to another unless we export to a file and then import from that file.

Best Answer

I recommend reading through the Data Loading Performance Guide, fully understanding your maintenance window and what kind of locks you can take during it, and testing on a different database with table sizes as close as possible to production. The question you asked is pretty generic, but one big thing you missed is that you should consider disabling nonclustered indexes before loading a large amount of data. It can be more efficient to rebuild the indexes after the data load is complete.
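As a sketch of that approach (the index name `IX_Order_Owner_FK` is a hypothetical example; substitute the real nonclustered indexes on your target table):

```sql
-- Disable each nonclustered index on the target table before the load.
-- Do NOT disable the clustered index: that makes the table inaccessible.
ALTER INDEX [IX_Order_Owner_FK] ON [dbo].[Order] DISABLE;

-- ... perform the bulk insert ...

-- Rebuilding re-enables the index in one pass over the loaded data.
ALTER INDEX [IX_Order_Owner_FK] ON [dbo].[Order] REBUILD;
```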

Set Recovery Model to bulk-logged recovery model. After bulk importing the data, switch the recovery model back to the full recovery model.

Make sure that you understand the consequences of the switch before doing it. The bulk-logged recovery model can help if it makes some of your queries eligible for minimal logging. Under the right conditions the insert query can be eligible for minimal logging (the target table is empty and you use the TABLOCK hint). Read the Data Loading Performance Guide for full details. The update and delete queries will not benefit from the recovery model switch unless you disable and rebuild indexes as part of that process.
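Assuming those conditions hold (bulk-logged or simple recovery and an empty target table), the insert from the question could carry the hint like this:

```sql
-- TABLOCK on the *target* table is one requirement for minimal logging.
INSERT INTO [dbo].[Order] WITH (TABLOCK) (
       [CREATED], [LAST_MODIFIED], [number],
       [Owner_FK], [Owner2_FK], [Owner3_FK])
SELECT [CREATED], [LAST_MODIFIED], [number],
       [Owner_FK], [Owner2_FK], [Owner3_FK]
FROM [Order_old]
WHERE [Owner_FK]  IS NOT NULL
   OR [Owner2_FK] IS NOT NULL
   OR [Owner3_FK] IS NOT NULL;
```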

Load data with the TABLOCK table hint to reduce locking during execution.

This is one of the requirements for minimal logging within SQL Server but otherwise isn't likely to really help you.

Set MAXDOP to 30 and change it back to normal.

What makes you want to use a MAXDOP of 30? The typical advice is to limit your queries to MAXDOP 8 and to consider going over 8 only for specific operations that parallelize well, such as index rebuilds. None of the queries listed in your question is likely to parallelize well. On SQL Server 2012 you aren't eligible for parallel insert. The part of the query that finds the data to be inserted is eligible for parallelism, but do you really want 30 parallel workers for that part? The delete and update parts of the queries are also single-threaded.
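If you do want a different degree of parallelism for these statements, a query-level hint is safer than changing the server-wide setting and switching it back; for example:

```sql
-- The hint caps just this statement; the server-wide setting is untouched.
-- Only the SELECT (scan) side can use the parallel workers here.
INSERT INTO [dbo].[Order] ([CREATED], [LAST_MODIFIED], [number])
SELECT [CREATED], [LAST_MODIFIED], [number]
FROM [Order_old]
OPTION (MAXDOP 8);
```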