Sql-server – Importing Data in Parallel in SQL Server

bulkbulk-insertparametersql server

I have more than 100 files to import into the sql server and most of them are of 500 MB. I want to leverage the the parallel import utility of SQL server and have read a number of webpages, like the following ones:

How to load 1 TB data in 30 minutes

https://technet.microsoft.com/en-us/library/dd537533(v=sql.100).aspx

Importing Data in Parallel with Table Level Locking

https://technet.microsoft.com/en-us/library/ms186341(v=sql.105).aspx

and the answers in stackoverflow

Fastest way to insert in parallel to a single table

However, none of them have given a simple example with code. I know how to use bulk insert/bcp, but I don't know where to start with parallel import? Can anyone help me with it?

My system is Windows and I'm using SQL server 2016. The source data file is in txt format.

Thanks in advance for your help!

Jason

Best Answer

Borrowing from my old answer

Below are some good ways to improve BULK INSERT operations :

  1. Using TABLOCK as query hint.
  2. Dropping Indexes during Bulk Load operation and then once it is completed then recreating them.
  3. Changing the Recovery model of database to be BULK_LOGGED during the load operation.
  4. If the target has Clustered Index then specifying ORDER BY clause in the bulk insert operation will increase the speed of BULK loading.
  5. Using Trace Flag 610 at the beginning of BULK INSERT operation.

The max degree of parallelism should be configured on the server rather than the default. You can refer to my answer on how it configure it here.

If you are using SQL Server 2014 and up then SELECT ... INTO is parallel.

Also, you should monitor Wait Statistics on the server especially SOS_SCHEDULER_YIELD resulting in scheduler contention on Servers having multiple CPUs running concurrent Bulk load operations and competing for the same CPU Cycles.

Also refer to :