SQL Server – sqlcmd slow against SQL Server 2008 R2

Tags: insert, sql-server-2008

I am trying to import a longish SQL script (3.5 million lines, 1.5 GB in size), consisting mainly of short INSERT INTO statements with data, using sqlcmd -E -d <database> -S 127.0.0.1 -i <scriptfile>, and it is nearly crawling to a halt (approximately 150 lines per second) without putting any noticeable load on the SQL Server host (all CPU cores are nearly idle, disk throughput is at around 200 KB/s).

As the source file is located on an NFS share, I first suspected this to be the culprit, but an identical script available locally runs at just the same speed.

The database is basically empty; the tables are created by the very same script, with no triggers or other fanciness, just the raw, mainly numeric or varchar data.

What is SQLCMD waiting for? Is there any way to speed it up?

Edit:

We are limited in our means to alter the data in the SQL script files. The data is provided by a third party for an import procedure. I believe it was originally exported using the Management Studio 2005 "Script table as…" feature.

Editing the file is tedious due to its sheer size; any operation takes forever with common text editors. Prepending a "SET NOCOUNT ON" was still possible by copying two files together, and it brought a speedup of approximately 50%.

Due to the text encoding (Unicode-LE), it is impossible to edit the file with the common set of GNU textutils (which would otherwise work fine with large files) without converting it first, and I am reluctant to convert it because of the data fidelity issues that would likely come with that.

So I am somewhat stuck on how to apply the suggestions about inserting BEGIN TRAN/COMMIT TRAN blocks or converting the individual inserts into larger sets.

Best Answer

Some ideas:

  • Inject a GO command every thousand or few thousand lines. Then instead of one ginormous batch, the script is broken up into multiple smaller batches (see the sketch after this list).
  • Change your individual INSERT statements to multi-row INSERT ... VALUES (...) statements with up to a thousand row value sets each (the maximum SQL Server allows per statement).
  • Use transactions and commit and/or checkpoint gratuitously (again, every 1000 inserts or so is probably a good place to start, in combination with GO). Your watch and the log will thank you.
  • It's a silly trivial little thing, but make sure your script has SET NOCOUNT ON - otherwise the UI, SQL Server, and the network in between spend a lot of time sending 1 row(s) affected messages back and forth for every single insert.
  • Best is probably to use BULK INSERT, BCP, or maybe even SSIS instead of Management Studio (a BULK INSERT sketch follows this list). Part of the problem is probably the memory overhead on your own machine just loading that huge file, never mind trying to parse/execute it. I'd be curious to see private bytes for ssms.exe while this is going on...
  • Based on offline discussion with the OP, adding another option: ask the vendor for a backup instead of a script. Then you can restore that to some test or dummy system and generate the inserts from it manually (or use import/export, or Red Gate's data compare tool). This may end up taking about the same amount of time overall, but it won't kill the log the way that huge script of millions of inserts will...
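As a rough sketch of what the first few points would look like in the script itself (the table and column names here are made up for illustration), each chunk of the file would follow a pattern along these lines:

    SET NOCOUNT ON;
    GO

    BEGIN TRANSACTION;

    -- one multi-row INSERT instead of a thousand single-row statements
    INSERT INTO dbo.SomeTable (Id, Amount, Label)
    VALUES (1, 10.50, N'first'),
           (2, 11.00, N'second'),
           (3, 12.25, N'third');  -- ... up to 1000 row value sets per statement

    COMMIT TRANSACTION;
    GO  -- end of batch; the pattern repeats every thousand rows or so

SET NOCOUNT ON only needs to appear once per connection, since SET options persist across batches, but it is harmless to repeat it.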
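And if the vendor could supply flat data files instead of INSERT scripts, a minimal BULK INSERT might look like the following; the file path, table name, and delimiters are assumptions and would need to match the actual export format:

    BULK INSERT dbo.SomeTable
    FROM 'C:\import\SomeTable.dat'    -- hypothetical data file from the vendor
    WITH (
        DATAFILETYPE    = 'widechar', -- Unicode (UTF-16 LE) input, matching the script encoding
        FIELDTERMINATOR = '\t',
        ROWTERMINATOR   = '\n',
        BATCHSIZE       = 10000,      -- commit every 10,000 rows to keep the log manageable
        TABLOCK
    );

BCP takes roughly the same options on the command line, and either way you avoid parsing millions of individual statements.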