INSERT performance degradation in SQL Server after a certain number of columns

Tags: insert, performance, sql-server, sql-server-2014

I’m working on an application that can export tabular data into a specified database table. The app uses INSERT statements to export its data to the target database.

The export is done with batched INSERT statements, 100 rows per SQL INSERT statement (for now I can’t use BULK INSERT or bcp).
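
For concreteness, here is a minimal sketch of the shape of one such batched INSERT (it uses the [Test_Table] definition and sample values shown further down; the real statements carry 100 rows of 100-character strings each):

-- Sketch of one batched INSERT; the actual batches contain 100 rows each.
INSERT INTO [Test_Table] ([Column 1], [Column 2], [Column 3], [Column 4], [Column 5])
VALUES
    ('R6YZ..uWaQ', 'R6YZ..uWaQ', 'R6YZ..uWaQ', 'R6YZ..uWaQ', 'R6YZ..uWaQ'),
    ('DMNW..Kh0a', 'DMNW..Kh0a', 'DMNW..Kh0a', 'DMNW..Kh0a', 'DMNW..Kh0a'),
    ('GKbg..yuap', 'GKbg..yuap', 'GKbg..yuap', 'GKbg..yuap', 'GKbg..yuap');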

I’ve noticed that the export time grows disproportionately once the number of columns in the source data exceeds a certain threshold (the threshold isn’t fixed; it depends on the size of the values, the number of rows per INSERT, and so on).

For example, exporting 50 000 rows (500 INSERT statements with 100 rows each) of random 100-character strings takes:

3 sec with 5 columns
6 sec with 10 columns
56 sec with 15 columns
77 sec with 20 columns

Note the difference in export time between 10 and 15 columns. I expected the 15-column export to take 9–10 sec, but it actually takes roughly five to six times longer. I observed similar performance degradation while testing exports of other datasets.

To be sure the problem is not on my side, I ran the same set of INSERT statements through sqlcmd.exe and got similar results.

Question: What can I do to make SQL Server handle a large number of columns as fast as a small one? Or at least push the point of performance degradation out to a larger number of columns?

Additional details:

  • INSERT queries were executed on a local SQL Server Express 2014 (64-bit), version 12.0.5000.0;
  • Database recovery model is set to Simple;
  • All the INSERT statements were wrapped in a single transaction (I also tried committing after each INSERT, but the results were pretty much the same); a sketch of this wrapping appears after this list;
  • The target table was created before each test. It was a simple table with no indexes, foreign keys, constraints, etc.;
  • Hard drive performance doesn’t seem to be the source of the problem, because during the first two tests (5 and 10 columns) the disk write speed of the sqlservr.exe process was 10 times higher than during the last two.
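
For reference, the overall test script looked roughly like this (a shortened, assumed sketch; the actual script contains 500 batched INSERT statements like the one shown above):

-- Shortened sketch of the test script: all INSERTs inside one transaction.
BEGIN TRANSACTION;

INSERT INTO [Test_Table] ([Column 1], [Column 2], [Column 3], [Column 4], [Column 5])
VALUES ('R6YZ..uWaQ', 'R6YZ..uWaQ', 'R6YZ..uWaQ', 'R6YZ..uWaQ', 'R6YZ..uWaQ');
-- ... 499 more batched INSERT statements, 100 rows each ...

COMMIT TRANSACTION;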

Tables are created like this:

CREATE TABLE [Test_Table]
(
    [Column 1] VARCHAR(255),
    [Column 2] VARCHAR(255),
    [Column 3] VARCHAR(255),
    [Column 4] VARCHAR(255),
    [Column 5] VARCHAR(255)
)

The data looks like this (each cell actually contains a 100-character string; all the strings in the same row are equal):

+------------+------------+------------+------------+------------+
| [Column 1] | [Column 2] | [Column 3] | [Column 4] | [Column 5] |
+------------+------------+------------+------------+------------+
| R6YZ..uWaQ | R6YZ..uWaQ | R6YZ..uWaQ | R6YZ..uWaQ | R6YZ..uWaQ |
| DMNW..Kh0a | DMNW..Kh0a | DMNW..Kh0a | DMNW..Kh0a | DMNW..Kh0a |
| GKbg..yuap | GKbg..yuap | GKbg..yuap | GKbg..yuap | GKbg..yuap |
| pG+f..64bX | pG+f..64bX | pG+f..64bX | pG+f..64bX | pG+f..64bX |
| O2Q7..fTNF | O2Q7..fTNF | O2Q7..fTNF | O2Q7..fTNF | O2Q7..fTNF |
+------------+------------+------------+------------+------------+

Here are two examples that reproduce the issue:

http://rextester.com/OZI56670 (10 columns, ~0.09 sec)
http://rextester.com/HLAP4972 (11 columns, ~0.45 sec)

Best Answer

The difference between the repro you posted with 10 columns and 100 rows and the one with 11 columns and 100 rows is that the execution plan for the first one uses Simple Parameterization.

The actual execution plan for the 10-column case lists parameters @1 through @1000.

11 * 100 is 1,100, but one thousand appears to be the maximum number of parameters an auto-parameterized query can have.

You are running 10 INSERT statements in each repro. In the 10-column case the plan can be compiled once and reused for the other 9 statements; in the 11-column case each INSERT statement has to be compiled individually.
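
You can confirm the difference in plan reuse by looking at the plan cache after running each repro. A sketch of such a check (the filter on Test_Table assumes the table name from your question):

-- Sketch: inspect cached plans after running a repro. An auto-parameterized
-- batch leaves one 'Prepared' entry whose usecounts matches the number of
-- INSERT statements; otherwise each statement gets its own fully compiled
-- 'Adhoc' plan with usecounts = 1.
SELECT cp.objtype, cp.usecounts, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE N'%Test_Table%'
  AND st.text NOT LIKE N'%dm_exec_cached_plans%'; -- exclude this query itself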

Moreover, compilation takes longer when SQL Server has the literal values to look at, as it spends time working out properties of the group of values (or at least that used to be the case; I'm not sure whether this has changed in more recent versions).
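
If you want to see that compile cost directly, SET STATISTICS TIME reports parse and compile time separately from execution time for each statement. A minimal sketch against the five-column table from your question:

-- Sketch: STATISTICS TIME separates compile time from execution time.
SET STATISTICS TIME ON;

INSERT INTO [Test_Table] ([Column 1], [Column 2], [Column 3], [Column 4], [Column 5])
VALUES ('R6YZ..uWaQ', 'R6YZ..uWaQ', 'R6YZ..uWaQ', 'R6YZ..uWaQ', 'R6YZ..uWaQ');

SET STATISTICS TIME OFF;
-- The "SQL Server parse and compile time" lines in the Messages output show
-- how much of the elapsed time each statement spends compiling.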