SQL Server – Three update queries vs. single update query performance

Tags: performance, query-performance, sql-server, update

I am trying to optimize a procedure. There are three different update queries present in the procedure:

-- 1) Take the text before the first double space (if any), trimmed
update #ResultSet
set MajorSector = case 
        when charindex('  ', Sector) > 2 then rtrim(ltrim(substring(Sector, 0, charindex('  ', Sector)))) 
            else ltrim(rtrim(Sector)) 
        end

-- 2) Strip a leading two-digit prefix such as '(01)'
update #ResultSet
set MajorSector = substring(MajorSector, 5, len(MajorSector)-4)
where left(MajorSector,4) in ('(00)','(01)','(02)','(03)','(04)','(05)','(06)','(07)','(08)','(09)')

-- 3) Strip a leading single-letter prefix such as '(A)'
update #ResultSet
set MajorSector = substring(MajorSector, 4, len(MajorSector)-3)
where left(MajorSector,3) in ('(A)','(B)','(C)','(D)','(E)','(F)','(G)','(H)','(I)','(J)','(K)','(L)','(M)','(N)','(O)','(P)','(Q)','(R)','(S)','(T)','(U)','(V)','(W)','(X)','(Y)','(Z)')

All three update queries together complete in less than 10 seconds.

Execution plan for all three update queries:

https://www.brentozar.com/pastetheplan/?id=r11BLfq7b

My plan was to combine the three separate update queries into one single update query, so that I/O could be reduced:

;WITH ResultSet
     AS (SELECT CASE
                  WHEN LEFT(temp_MajorSector, 4) IN ( '(00)', '(01)', '(02)', '(03)', '(04)', '(05)', '(06)', '(07)', '(08)', '(09)' ) 
                      THEN Substring(temp_MajorSector, 5, Len(temp_MajorSector) - 4)
                  WHEN LEFT(temp_MajorSector, 3) IN ( '(A)', '(B)', '(C)', '(D)','(E)', '(F)', '(G)', '(H)','(I)', '(J)', '(K)', '(L)','(M)', '(N)', '(O)', '(P)','(Q)', '(R)', '(S)', '(T)','(U)', '(V)', '(W)', '(X)','(Y)', '(Z)' ) 
                      THEN Substring(temp_MajorSector, 4, Len(temp_MajorSector) - 3)
                  ELSE temp_MajorSector
                END AS temp_MajorSector,
                MajorSector
         FROM   (SELECT temp_MajorSector = CASE
                                             WHEN Charindex('  ', Sector) > 2 THEN Rtrim(Ltrim(Substring(Sector, 0, Charindex('  ', Sector))))
                                             ELSE Ltrim(Rtrim(sector))
                                           END,
                        MajorSector
                 FROM   #ResultSet)a)
UPDATE ResultSet
SET    MajorSector = temp_MajorSector  

But this takes around 1 minute to complete. I checked the execution plan; it is identical to the plan of the first update query.

Execution plan for above query:

https://www.brentozar.com/pastetheplan/?id=SJvttz9QW

Can somebody explain why it is slow?

Dummy data for testing:

If object_id('tempdb..#ResultSet') is not null
drop table #ResultSet


;WITH lv0 AS (SELECT 0 g UNION ALL SELECT 0)
    ,lv1 AS (SELECT 0 g FROM lv0 a CROSS JOIN lv0 b) -- 4
    ,lv2 AS (SELECT 0 g FROM lv1 a CROSS JOIN lv1 b) -- 16
    ,lv3 AS (SELECT 0 g FROM lv2 a CROSS JOIN lv2 b) -- 256
    ,lv4 AS (SELECT 0 g FROM lv3 a CROSS JOIN lv3 b) -- 65,536
    ,lv5 AS (SELECT 0 g FROM lv4 a CROSS JOIN lv4 b) -- 4,294,967,296
    ,Tally (n) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM lv5)
SELECT CONVERT(varchar(255), NEWID()) as Sector,cast('' as varchar(1000)) as MajorSector
into #ResultSet
FROM Tally
where  n <= 242906 -- my original table record count
ORDER BY n;

Note: since this is not my original data, the timings mentioned above could be a little different. Still, the single update query is much slower than the three separate ones.

I executed the queries more than 10 times to make sure external factors did not affect the performance. All 10 times the three separate updates ran much faster than the single combined update.

Best Answer

The first of the three updates reads and writes every row of the table. The second and third then re-read and re-write a subset of those rows (look at the Actual Number of Rows). When the three statements are combined into one, the optimizer figures that if it has to read every row to satisfy the first change, it can piggy-back the second and third changes on the same pass.

Have a look at the XML version of the query plans, specifically the <ComputeScalar> operators and their <ScalarOperator ScalarString=""> attributes. In the original plan you'll see each is relatively simple and maps very closely to the SQL. In the all-in-one plan it's a monster. This is the optimizer rewriting the SQL into a logically equivalent form. A plan works¹ by passing each row through the operators once, so the optimizer does all the work needed to satisfy all three changes as each row passes through a single time.
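For illustration, a logically equivalent "one pass" form can also be written by hand, collapsing the three transformations into a single expression per row. This is a sketch only (the CROSS APPLY alias and the LIKE pattern for the letter prefixes are my own shorthand, not taken from the question or the plan XML):

```sql
-- Sketch: all three changes as one scalar expression, evaluated once per row.
-- trimmed.v holds the result of the first update's trim/double-space logic.
UPDATE r
SET MajorSector =
    CASE
        WHEN LEFT(trimmed.v, 4) IN ('(00)','(01)','(02)','(03)','(04)',
                                    '(05)','(06)','(07)','(08)','(09)')
            THEN SUBSTRING(trimmed.v, 5, LEN(trimmed.v) - 4)
        WHEN LEFT(trimmed.v, 3) LIKE '([A-Z])'   -- '(A)' .. '(Z)'
            THEN SUBSTRING(trimmed.v, 4, LEN(trimmed.v) - 3)
        ELSE trimmed.v
    END
FROM #ResultSet AS r
CROSS APPLY (SELECT CASE
                      WHEN CHARINDEX('  ', r.Sector) > 2
                          THEN RTRIM(LTRIM(SUBSTRING(r.Sector, 0, CHARINDEX('  ', r.Sector))))
                      ELSE LTRIM(RTRIM(r.Sector))
                    END) AS trimmed(v);
```

Whether written this way or via the nested-CTE form in the question, the point is the same: every row pays for every branch of the combined expression on its single pass through the plan.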

I'd expect the second query to be faster because the data is only read and written once, whereas it is touched three times in the first.

As the second query has no predicates (no WHERE clause), the optimizer has no choice but to read and process every single row. I'm surprised the second form takes longer than the first, though. Are both starting from clean buffers? Is there other work happening on the system? Since it reads and writes a temp table, the I/O happens in tempdb. Is there file growth or some such happening?
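To rule out caching effects between timing runs, the buffer pool and plan cache can be flushed first. These are standard DBCC commands, but they affect the whole instance, so use them on a test server only:

```sql
-- Test servers only: flush caches so each run starts cold.
CHECKPOINT;             -- write dirty pages to disk so clean buffers can be dropped
DBCC DROPCLEANBUFFERS;  -- empty the buffer pool (forces physical reads on next run)
DBCC FREEPROCCACHE;     -- clear cached plans (forces recompilation on next run)
```

Running each variant both cold and warm separates genuine CPU cost from I/O and compilation noise.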

By one measure you have achieved your desired outcome: you said you wanted to make changes "so that the I/O can be reduced", and the all-in-one statement does less I/O than the three separate statements do in total. I suspect what you really want, however, is reduced elapsed time, and that is clearly not happening.


¹ More or less; lots of details omitted.


I ran your routine to generate test data, then ran the three single-update statements and the all-in-one statement. Although there are some differences (no clustered index, no parallelism) I get more-or-less the same results. Specifically, the plans are about the same shape, the three individual queries complete in about two seconds, and the one big query takes about thirty to thirty-five seconds.

I set

set nocount off;
set statistics io on;
set statistics time on;

With the plan in cache and the data in memory I get:

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.
Table '#ResultSet...'. Scan count 1, logical reads 125223, physical reads 0

 SQL Server Execution Times:
   CPU time = 1422 ms,  elapsed time = 1417 ms.

(242906 row(s) affected)

Table '#ResultSet...'. Scan count 1, logical reads 125223, physical reads 0

 SQL Server Execution Times:
   CPU time = 344 ms,  elapsed time = 337 ms.

(0 row(s) affected)

Table '#ResultSet...'. Scan count 1, logical reads 125223, physical reads 0

 SQL Server Execution Times:
   CPU time = 734 ms,  elapsed time = 747 ms.

(0 row(s) affected)

I've removed some bits that aren't relevant. Since physical reads is zero for all three, the table fits in memory. Logical reads is the same for all three, which makes sense: as there are no indexes, the only approach is to scan every row of the table. The second and third queries affect zero rows because I'd already run them a few times. CPU time and elapsed time both work out to about 2,500 ms in total.

For the big combined query it is:

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.
Table '#ResultSet...'. Scan count 1, logical reads 125223

 SQL Server Execution Times:
   CPU time = 33093 ms,  elapsed time = 33137 ms.

(242906 row(s) affected)

The same number of pages are read and the same number of rows are updated. The huge difference is the CPU time. This is reflected in casual observation of Task Manager, which shows 30% utilisation for the duration of query execution. The question is, why does it take so much CPU?

The individual queries have simple calculations, and two of the three have predicates that greatly reduce the number of rows touched. The optimizer has good heuristics for processing these and finds a quick plan. The all-in-one query applies the monster Compute Scalar to every single row. My suggestion is that, for whatever reason, the optimizer cannot unravel the logic into a plan that's quick to run, and ends up burning a lot of CPU. The optimizer has to work with what it's given, which in the second case is complex, nested SQL.

Perhaps refactoring the SQL will lead the optimizer through different heuristics and achieve a better outcome. Perhaps some (filtered) indexes or (filtered) statistics will convince it to write a different plan. Maybe persisted computed columns would help it along. Or perhaps your first attempt really is the best that can be achieved, and you need to find a way to run those three statements in parallel. Sorry I can't be more scientific.
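As a sketch of the persisted computed column idea (the column and index names here are my own invention, not from the question): precompute the prefix once, at write time, so the second and third updates can filter on a plain indexed column instead of evaluating LEFT() per row.

```sql
-- Hypothetical: materialise the 4-character prefix so predicates can
-- seek on an indexed column rather than compute LEFT() on every row.
-- Note the extra cost: the persisted column is re-evaluated whenever
-- MajorSector is updated.
ALTER TABLE #ResultSet
    ADD SectorPrefix AS LEFT(MajorSector, 4) PERSISTED;

CREATE INDEX IX_ResultSet_SectorPrefix
    ON #ResultSet (SectorPrefix);
```

Whether this pays off depends on how selective the prefixes are; with only ~243k rows and a full-scan first update already required, it may well not, but it gives the optimizer another option to work with.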