SQL Server – Copy Huge Table with Transformations to Another Table

bulk-insert sql-server sql-server-2016

I have a large table that was imported from a CSV using the import wizard. I have a query that selects from this table and replaces all of the string "code" columns with integer ids from other dimension tables and performs some transformations on some other columns. This all works well.

I need to insert the results of this query into a table in the database that has the required foreign key links and indices. This is also the table that the SQL objects (sprocs, functions, views) use to access the data.

My issue is that the insert runs for a long time (1.2 hours) before it eventually fails, complaining that the transaction log is full. I don't have access to log onto the server itself; I only have access through SSMS, with db_owner rights on the database in question.

My questions are:

  1. How can I perform the insert without logging it in the transaction log? I know there is BULK INSERT and OPENROWSET(BULK ...) but these require a file.
  2. Would the insert go faster if I drop the indices? I think this would break the execution plan of the existing SQL objects that use the destination table. Is this advisable if there is a clustered index on the destination table?
  3. Is not logging the insert the correct way to go or should I use some other method?

Best Answer

If this is a one-time job, you might just use the BCP Utility to do this task.

It sounds like your sticking point is a single massive insert whose transaction exceeds what your transaction log (and the disk behind it) can hold. What you need to do is break the import into batches, which BCP supports, so that no single transaction outgrows the log.
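As an aside, the same batching idea can be applied entirely inside SSMS, without an intermediate file, using a loop of small INSERT ... SELECT statements. This is only a sketch: the table and column names are placeholders, and it assumes the destination has a key you can use to skip rows already copied.

```sql
-- Sketch only: dbo.Staging, dbo.FactDestination, and the join/key
-- columns are placeholders for your actual tables.
-- Each iteration commits its own small transaction, so the log can
-- be reused between batches (given frequent log backups or SIMPLE
-- recovery) instead of holding one 1.2-hour transaction.
DECLARE @BatchSize int = 50000,
        @Rows      int = 1;

WHILE @Rows > 0
BEGIN
    INSERT INTO dbo.FactDestination (Col1, Col2, DimKey)
    SELECT TOP (@BatchSize)
           s.Col1, s.Col2, d.DimKey
    FROM   dbo.Staging AS s
    JOIN   dbo.Dim     AS d
           ON d.Code = s.Code
    WHERE  NOT EXISTS (SELECT 1
                       FROM dbo.FactDestination AS f
                       WHERE f.Col1 = s.Col1);  -- skip rows already moved

    SET @Rows = @@ROWCOUNT;  -- stop once nothing is left to copy
END;
```

The NOT EXISTS check is what makes the loop restartable if it is interrupted partway through.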

When I have had to make a single massive data transfer, I have had good success using the BCP utility to extract the data to a file. This does, of course, require a file location to temporarily hold the extracted data.

The examples below are adapted from the bcp Utility documentation on MSDN. (Scroll to see the entire commands.)

Copy Data to File Storage:

 bcp WideWorldImporters.Warehouse.StockItemTransactions OUT D:\BCP\StockItemTransactions_native.bcp -m 1 -n -e D:\BCP\Error_out.log -o D:\BCP\Output_out.log -S ServerA -T

To import the data back into a database using BCP, you should define a batch size, such as -b 5000 (commit a transaction every 5,000 rows), so as to avoid filling up the transaction log.

You may also need to schedule transaction log backups more frequently during a massive import like the one you describe. Alternatively, you can switch the database to the SIMPLE recovery model for the duration of the insert.
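Switching recovery models looks like the following. This assumes db_owner is sufficient to run ALTER DATABASE in your environment (some shops restrict it), and the database and backup path are placeholders. Remember that switching to SIMPLE breaks the log backup chain, so take a full or differential backup afterwards to restore point-in-time recovery.

```sql
-- Placeholder database name and backup path.
ALTER DATABASE MyImportDb SET RECOVERY SIMPLE;

-- ... run the bulk import here ...

ALTER DATABASE MyImportDb SET RECOVERY FULL;

-- Restart the log backup chain after returning to FULL.
BACKUP DATABASE MyImportDb
TO DISK = N'D:\Backups\MyImportDb_post_import.bak';
```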

Copy from File Storage to Database:

bcp WideWorldImporters.Warehouse.StockItemTransactions_bcp IN D:\BCP\StockItemTransactions_native.bcp -b 5000 -h "TABLOCK" -m 1 -n -e D:\BCP\Error_in.log -o D:\BCP\Output_in.log -S ServerA -T

These are reliable tools, if not the fanciest. Alternatively, you can import the data using BULK INSERT, which in my experience runs faster than bcp, and it also has a BATCHSIZE option for controlling batch size.
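A BULK INSERT equivalent of the bcp IN command above might look like this. Note that the file path is read by the SQL Server service account, so it must be visible from the server, not from your workstation.

```sql
-- Same table and file as the bcp IN example above.
BULK INSERT WideWorldImporters.Warehouse.StockItemTransactions_bcp
FROM 'D:\BCP\StockItemTransactions_native.bcp'
WITH (
    DATAFILETYPE = 'native',  -- matches the -n (native format) export
    BATCHSIZE    = 5000,      -- commit every 5,000 rows, like -b 5000
    TABLOCK                   -- table lock; enables minimal logging
                              -- when the table and recovery model qualify
);
```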

Refer to the bcp Utility and BULK INSERT documentation if your needs are more complex than the simple examples I have used.

Also see this post, which has several relevant answers: Freeing Unused Space SQL Server Table

Note that Kin's answer there covers the BCP and BULK INSERT options mentioned here, but the post noted above includes other good options as well.