SQL Server – sqlcmd slow against SQL Server 2008 R2

Tags: insert, sql-server-2008

I am trying to import a longish SQL script (3.5 million lines, 1.5 GB in size), consisting mainly of short INSERT INTO statements with data, using sqlcmd -E -d <database> -S 127.0.0.1 -i <scriptfile>, and it is nearly crawling to a halt (approximately 150 lines per second) without putting any noticeable load on the SQL Server host (all CPU cores are nearly idle, disk throughput is at around 200 KB/s).

As the source file is located on an NFS share, I first suspected this to be the culprit, but an identical script available locally runs at just the same speed.

The database is basically empty; the tables are created by the very same script, with no triggers or other fanciness, just the raw, mainly numeric or varchar data.

What is SQLCMD waiting for? Is there any way to speed it up?

Edit:

We are limited in our means to alter the data in the SQL script files. The data is provided by a third party for an import procedure. I believe it was originally exported using the Management Studio 2005 "Script table as…" feature.

Editing the file is tedious due to its sheer size; any operation takes forever with common text editors. Prepending a "SET NOCOUNT ON" was still possible by copying two files together, and it brought a speedup of approximately 50%.

Due to the text encoding (Unicode-LE), it is impossible to edit the file with the common set of GNU textutils (which would otherwise work fine with large files) without converting it first, and I am reluctant to convert it because of the data fidelity issues that would likely come with that.

So I am somewhat stuck on how to apply the suggestions about inserting BEGIN TRAN/COMMIT TRAN blocks or converting the individual inserts into larger sets.

Best Answer

Some ideas:

  • Inject a GO command every thousand or few thousand lines. Then instead of one ginormous batch, the script is broken up into multiple smaller batches (see the sketch after this list).
  • Change your individual INSERT statements to multi-row INSERT ... VALUES (...) statements with up to a thousand row value sets each (the maximum SQL Server allows per statement).
  • Use transactions and commit and/or checkpoint gratuitously (again, every 1000 inserts or so is probably a good place to start, in combination with GO). Your watch and the log will thank you.
  • It's a silly trivial little thing, but make sure your script has SET NOCOUNT ON - otherwise the UI, SQL Server, and the network in between spend a lot of time sending 1 row(s) affected messages back and forth for every single insert.
  • Best is probably to use BULK INSERT, BCP, or maybe even SSIS instead of Management Studio (a BULK INSERT sketch follows this list). Part of the problem is probably the memory overhead on your own machine just loading that huge file, never mind trying to parse/execute it. I'd be curious to see private bytes for ssms.exe while this is going on...
  • Based on offline discussion with the OP, adding another option: ask the vendor for a backup instead of a script. Then you can restore that to some test or dummy system and generate the inserts from it manually (or use import/export, or Red Gate's data compare tool). This may end up taking about the same amount of time overall, but it won't kill the log the way that huge script of millions of inserts will...
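As a rough sketch of what the first few points would look like in the script itself (the table and column names here are made up for illustration), each chunk of the file would follow a pattern along these lines:

    SET NOCOUNT ON;
    GO

    BEGIN TRANSACTION;

    -- one multi-row INSERT instead of a thousand single-row statements
    INSERT INTO dbo.SomeTable (Id, Amount, Label)
    VALUES (1, 10.50, N'first'),
           (2, 11.00, N'second'),
           (3, 12.25, N'third');  -- ... up to 1000 row value sets per statement

    COMMIT TRANSACTION;
    GO  -- end of batch; the pattern repeats every thousand rows or so

SET NOCOUNT ON only needs to appear once per connection, since SET options persist across batches, but it is harmless to repeat it.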
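And if the vendor could supply flat data files instead of INSERT scripts, a minimal BULK INSERT might look like the following; the file path, table name, and delimiters are assumptions and would need to match the actual export format:

    BULK INSERT dbo.SomeTable
    FROM 'C:\import\SomeTable.dat'    -- hypothetical data file from the vendor
    WITH (
        DATAFILETYPE    = 'widechar', -- Unicode (UTF-16 LE) input, matching the script encoding
        FIELDTERMINATOR = '\t',
        ROWTERMINATOR   = '\n',
        BATCHSIZE       = 10000,      -- commit every 10,000 rows to keep the log manageable
        TABLOCK
    );

BCP takes roughly the same options on the command line, and either way you avoid parsing millions of individual statements.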