I know it's an old Question but this might still help searchers and it's a problem that pops up every now and then.
The main reason you are hitting a performance ceiling without seeing any resource bottleneck is that you've reached the limit of what can be processed by a single session's single thread. The loop isn't processed in parallel; all inserts are done serially.
In my case, it takes 36 seconds to insert 3 million rows. That means 36 / 3,000,000 = 0.000012 seconds per row. That's pretty fast. On my system, it simply takes 0.000012 seconds to go through all the steps that are necessary for one insert.
The only way to get it done faster is to start up a second session in parallel.
If I start 2 sessions in parallel, each doing 1.5 million inserts, both finish in 18 seconds. I could scale out more, but my current test setup hits 95% CPU with two parallel sessions, so adding a third would skew the results because I would hit a CPU bottleneck.
If I start 2 parallel sessions each inserting 3 million rows, they both finish in 39 seconds. So that is now 6 million rows in 39 seconds.
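For context, the kind of single-session test loop being timed here can be sketched as follows (the table name, columns and row count are placeholders, not the original test code):

```sql
-- single-session, single-threaded loop: 3 million serial inserts
SET NOCOUNT ON;

DECLARE @i INT = 1;
WHILE @i <= 3000000
BEGIN
    INSERT INTO dbo.TestTable (ID, Payload)
    VALUES (@i, 'x');

    SET @i += 1;
END;
```

Running this same script simultaneously from two sessions (each covering half of the range) is what produces the parallel timings above.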
Okay, that still leaves us with the NETWORK_IO wait showing up.
The NETWORK_IO waits are an artifact of using Extended Events to trace them. In my case the insert takes 36 seconds (on average). When using the Extended Events approach (from the link above in the very first comment), this is what is registered:
Wait Type            Wait Count  Total Wait Time (ms)  Total Resource Wait Time (ms)  Total Signal Wait Time (ms)
NETWORK_IO                 3455                 68808                          68802                            6
PAGEIOLATCH_SH                3                    64                             64                            0
PAGEIOLATCH_UP               12                    58                             58                            0
WRITE_COMPLETION              8                    15                             15                            0
WRITELOG                      3                     9                              9                            0
PAGELATCH_UP                  2                     4                              4                            0
SOS_SCHEDULER_YIELD       32277                     1                              0                            1
IO_COMPLETION                 8                     0                              0                            0
LATCH_SH                      3                     0                              0                            0
LOGBUFFER                     1                     0                              0                            0
You can see that 68 seconds of NETWORK_IO is registered. But since the insert loop is a single-threaded action that took 36 seconds, this can't be. (Yes, multiple threads are used, but the operations are serial, never in parallel, so you can't accumulate more wait time than the total duration of the query.)
If I don't use extended events but just the wait stats DMVs on a quiet instance (with just me running the insert) I get this:
Wait Type                        Wait Count  Total Wait Time (ms)  Total Resource Wait Time (ms)  Total Signal Wait Time (ms)
SOS_SCHEDULER_YIELD                    8873                  0.21                           0.01                        0.20
PAGEIOLATCH_UP                            3                  0.02                           0.02                        0.00
PREEMPTIVE_OS_AUTHENTICATIONOPS          17                  0.02                           0.02                        0.00
PAGEIOLATCH_SH                            1                  0.00                           0.00                        0.00
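The numbers above come from the standard wait-stats DMV; a query of roughly this shape produces them (on a quiet instance the instance-wide numbers reflect just your session; on SQL Server 2016+ you can use sys.dm_exec_session_wait_stats for true per-session numbers):

```sql
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       wait_time_ms - signal_wait_time_ms AS resource_wait_time_ms,
       signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_time_ms > 0
ORDER BY wait_time_ms DESC;
```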
So the NETWORK_IO you were seeing in the Extended Events log wasn't related to your insert loop. (If you didn't turn NOCOUNT ON, you would have massive async network IO waits; +1 Martin.)
However, I don't know why the NETWORK_IO shows up in the Extended Events trace. Sure, writing the events out to an asynchronous file target accumulates ASYNC_NETWORK_IO, but surely that is all done on a different SPID than the one we are filtering on. (I might ask this as a new question myself.)
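For reference, a wait-tracking Extended Events session of the kind used above looks roughly like this (the session name, file path and SPID are placeholders, not the exact session from the linked comment):

```sql
CREATE EVENT SESSION [TrackWaits] ON SERVER
ADD EVENT sqlos.wait_info (
    WHERE (sqlserver.session_id = 53)  -- the SPID running the insert loop
)
ADD TARGET package0.event_file (SET filename = N'C:\Temp\TrackWaits.xel');

ALTER EVENT SESSION [TrackWaits] ON SERVER STATE = START;
```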
Your question can basically be rephrased as 'How does the query memory grant work?'. A good read on the subject is Understanding SQL Server memory grant.

Before a query is launched into execution, it may require a memory grant for sorts, hashes and other memory-hungry operations. This memory grant is an estimate. Based on the current system state (number of requests running and pending, memory available, etc.), the system grants the query a memory grant up to the required amount. Once the memory is granted, the query starts execution (it may have to wait in the dreaded 'resource semaphore' queue before it gets the grant).

At execution time, its memory grant is guaranteed by the system. This amount of memory can be shared with data pages (since they can always be flushed to disk) but never with other memory usage (i.e. it cannot be subject to 'steal'). So when the query starts asking for committed memory from its grant, the engine will deploy what you call 'strategy #1': data pages may be evicted (flushed if dirty) in order to give the query the memory it was promised.

Now, if the estimate was correct and the grant was 100% of the requested memory, the query should not 'spill'. But if the estimate was incorrect (it boils down to cardinality estimates, and is therefore subject to stale statistics), or if the query did not get the entire grant it asked for, the query will 'spill'. This is when tempdb comes into the picture and performance usually tanks.
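To see whether running queries are actually waiting for, or under-receiving, their grants, the standard memory grant DMV can be inspected (this query is my addition, not part of the original discussion):

```sql
SELECT session_id,
       requested_memory_kb,
       granted_memory_kb,   -- NULL while still queued on the resource semaphore
       ideal_memory_kb,     -- what the query would like with no memory pressure
       used_memory_kb,
       wait_time_ms
FROM sys.dm_exec_query_memory_grants;
```

A granted_memory_kb well below ideal_memory_kb is the reduced-grant situation described above.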
The only knob you have at your disposal that controls something in this process is the Resource Governor. Since the RG can be used to specify a MIN setting for a pool, it can be used to reserve memory for a certain workload so that it actually gets the memory grant it requests. Of course, this should only be done after proper investigation shows that reduced memory grants are the culprit, and after the impact on other workloads has been evaluated. And tested, of course.
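A minimal sketch of that Resource Governor setup (pool/group names and percentages are illustrative assumptions):

```sql
-- reserve a minimum share of query-grant memory for a reporting workload
CREATE RESOURCE POOL ReportPool
    WITH (MIN_MEMORY_PERCENT = 30, MAX_MEMORY_PERCENT = 100);

CREATE WORKLOAD GROUP ReportGroup
    USING ReportPool;

ALTER RESOURCE GOVERNOR RECONFIGURE;
```

A classifier function is still needed to route the relevant sessions into ReportGroup; without one, everything stays in the default group and the reservation has no effect.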
Now let's go back to your original question. If your investigation is correct (a very big if), I would like to point out two problems:
- You run, in production, queries that require memory grants to serve a web site. This is a big no-no. Memory grants are indicative of analytical queries, which have no place in serving HTTP requests.
- Your queries are probably not even getting the memory grant they request. Again, even more of a no-no for a latency-critical workload such as a web site.
So what that tells me is that you have a fundamental design and architecture problem. Web sites are latency driven and should create an OLTP-like workload, with no memory grants and no memory pressure on queries, not to mention no spills. Analytical queries should be run in offline jobs that store the pre-processed results for quick availability when HTTP requests need them.
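One common pattern consistent with this advice (table and column names here are purely illustrative): a scheduled job pre-computes the analytical result into a narrow summary table, and the web site reads only that table.

```sql
-- offline job, e.g. run every few minutes by SQL Server Agent:
-- the expensive aggregation (and its memory grant) happens here,
-- never on the HTTP request path
TRUNCATE TABLE dbo.SalesSummary;

INSERT INTO dbo.SalesSummary (ProductID, TotalQty, TotalAmount)
SELECT ProductID, SUM(Qty), SUM(Amount)
FROM   dbo.Sales
GROUP BY ProductID;
```

The request path then runs only a trivial, grant-free seek against dbo.SalesSummary.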
Given the code in your answer, you would most likely improve performance by making the following two changes:
- Start the query batch with BEGIN TRAN and end the batch with COMMIT TRAN.
- Decrease the number of updates per batch to less than 5000 to avoid lock escalation (which generally occurs at 5000 locks). Try 4500.

Doing those two things should decrease the massive amount of transaction log writes and lock/unlock operations that you are currently generating by doing individual DML statements.
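As an example of those two changes combined (the table and column names, and the GUID literals, are placeholders, since the Question's actual code is not reproduced here):

```sql
BEGIN TRAN;

-- the individual single-row UPDATEs generated by the app layer,
-- batched 4500 at a time to stay under the ~5000-lock escalation threshold
UPDATE dbo.TableName SET UniqueValue = '439A30A1-1B90-4705-9AA7-2AB6CD6A2058' WHERE ID = 1;
UPDATE dbo.TableName SET UniqueValue = 'BD67FE51-0B64-4F44-8B78-7D93AAB0C49C' WHERE ID = 2;
-- ... 4498 more single-row UPDATEs ...

COMMIT TRAN;
```

Wrapping the batch in one explicit transaction means one log flush per 4500 rows instead of one per row.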
The Question is a bit sparse on the details. The example code is only shown in an answer.
One area of confusion is that the description mentions updating two columns, yet the example code only shows a single column being updated. My answer above was based on the code, hence it only shows a single column. If there really are two columns to update, then both columns should be updated in the same UPDATE statement (a single SET clause can assign both columns).

Another issue that is unclear is where the "unique" data is coming from. The Question mentions that the unique values are GUIDs. Are these being generated in the app layer? Are they coming from another data source that the app layer knows about and the database does not? This is important because, depending on the answers to these questions, it might make sense to ask: (#1) can the unique values be generated by SQL Server itself (e.g. via NEWID()) rather than supplied by .NET?
If "yes" to #1 but the SQL, for whatever reason, needs to be generated in .NET, then you can use NEWID() and generate UPDATE statements that work on ranges of rows, in which case you do not need the BEGIN TRAN / COMMIT TRAN since each statement can handle all 4500 rows in one shot.

If "yes" to #1 and there is no real reason for these UPDATEs to be generated in .NET, then you can just do the following:
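A sketch of that pure T-SQL approach (table and column names are assumptions; the batch size matches the lock-escalation advice above):

```sql
-- loop in batches of 4500, generating the GUIDs with NEWID()
DECLARE @BatchSize    INT = 4500,
        @RowsAffected INT = 1,
        @StartingID   INT = 1;

WHILE (@RowsAffected > 0)
BEGIN
    UPDATE tab
    SET    tab.UniqueValue = NEWID()
    FROM   dbo.TableName tab
    WHERE  tab.ID BETWEEN @StartingID AND (@StartingID + @BatchSize - 1);

    SET @RowsAffected = @@ROWCOUNT;
    SET @StartingID  += @BatchSize;
END;
```

Each UPDATE is its own auto-commit transaction of at most 4500 rows, so no explicit BEGIN TRAN / COMMIT TRAN is needed.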
The code above only works if the ID values are not sparse, or at least if the values do not have gaps larger than @BatchSize, such that at least 1 row is updated in each iteration. This code also assumes that the ID field is the Clustered Index. These assumptions seem reasonable given the provided example code.

However, if the ID values do have large gaps, or if the ID field is not the Clustered Index, then you can instead test for rows that do not already have a value.

BUT, if "no" to #1 and the values are coming from .NET for a good reason, such as the unique values per each ID already existing in another source, then you can still speed this up (beyond my initial suggestion) by supplying a derived table.

I believe the limit on the number of rows that can be joined via VALUES is 1000, so I grouped two sets together in an explicit transaction. You could test with up to 4 sets of these UPDATEs to do 4000 per transaction and keep below the limit of 5000 locks.
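A sketch of that derived-table approach (table and column names, and the GUID literals, are placeholders):

```sql
BEGIN TRAN;

-- first set: up to 1000 (ID, GUID) pairs supplied by .NET as a VALUES list
UPDATE tab
SET    tab.UniqueValue = src.UniqueValue
FROM   dbo.TableName tab
INNER JOIN (VALUES
        (1, '439A30A1-1B90-4705-9AA7-2AB6CD6A2058'),
        (2, 'BD67FE51-0B64-4F44-8B78-7D93AAB0C49C')
        -- ... up to 1000 pairs ...
    ) src (ID, UniqueValue)
    ON src.ID = tab.ID;

-- second set: the next batch of up to 1000 pairs, same shape,
-- still inside the same transaction

COMMIT TRAN;
```

Each set-based UPDATE replaces up to 1000 single-row statements, and grouping the sets in one transaction keeps the log-flush count down while staying under 5000 locks.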