SQL Server – Three update queries vs. single update query performance

Tags: performance, query-performance, sql-server, update

I am trying to optimize a procedure. There are three different update queries present in the procedure:

-- 1) Take the text before the first double space (if any), trimmed
update #ResultSet
set MajorSector = case 
        when charindex('  ', Sector) > 2 then rtrim(ltrim(substring(Sector, 0, charindex('  ', Sector)))) 
            else ltrim(rtrim(Sector)) 
        end

-- 2) Strip a leading two-digit prefix such as '(01)'
update #ResultSet
set MajorSector = substring(MajorSector, 5, len(MajorSector)-4)
where left(MajorSector,4) in ('(00)','(01)','(02)','(03)','(04)','(05)','(06)','(07)','(08)','(09)')

-- 3) Strip a leading single-letter prefix such as '(A)'
update #ResultSet
set MajorSector = substring(MajorSector, 4, len(MajorSector)-3)
where left(MajorSector,3) in ('(A)','(B)','(C)','(D)','(E)','(F)','(G)','(H)','(I)','(J)','(K)','(L)','(M)','(N)','(O)','(P)','(Q)','(R)','(S)','(T)','(U)','(V)','(W)','(X)','(Y)','(Z)')

All three update queries together complete in less than 10 seconds.

Execution plan for all three update queries:

https://www.brentozar.com/pastetheplan/?id=r11BLfq7b

My plan was to combine the three separate update queries into one single update query, so that I/O could be reduced:

;WITH ResultSet
     AS (SELECT CASE
                  WHEN LEFT(temp_MajorSector, 4) IN ( '(00)', '(01)', '(02)', '(03)', '(04)', '(05)', '(06)', '(07)', '(08)', '(09)' ) 
                      THEN Substring(temp_MajorSector, 5, Len(temp_MajorSector) - 4)
                  WHEN LEFT(temp_MajorSector, 3) IN ( '(A)', '(B)', '(C)', '(D)','(E)', '(F)', '(G)', '(H)','(I)', '(J)', '(K)', '(L)','(M)', '(N)', '(O)', '(P)','(Q)', '(R)', '(S)', '(T)','(U)', '(V)', '(W)', '(X)','(Y)', '(Z)' ) 
                      THEN Substring(temp_MajorSector, 4, Len(temp_MajorSector) - 3)
                  ELSE temp_MajorSector
                END AS temp_MajorSector,
                MajorSector
         FROM   (SELECT temp_MajorSector = CASE
                                             WHEN Charindex('  ', Sector) > 2 THEN Rtrim(Ltrim(Substring(Sector, 0, Charindex('  ', Sector))))
                                             ELSE Ltrim(Rtrim(sector))
                                           END,
                        MajorSector
                 FROM   #ResultSet)a)
UPDATE ResultSet
SET    MajorSector = temp_MajorSector  

But this takes around 1 minute to complete. I checked the execution plan; it is identical to the plan of the first update query.

Execution plan for above query:

https://www.brentozar.com/pastetheplan/?id=SJvttz9QW

Can somebody explain why it is slow?

Dummy data for testing:

If object_id('tempdb..#ResultSet') is not null
drop table #ResultSet


;WITH lv0 AS (SELECT 0 g UNION ALL SELECT 0)
    ,lv1 AS (SELECT 0 g FROM lv0 a CROSS JOIN lv0 b) -- 4
    ,lv2 AS (SELECT 0 g FROM lv1 a CROSS JOIN lv1 b) -- 16
    ,lv3 AS (SELECT 0 g FROM lv2 a CROSS JOIN lv2 b) -- 256
    ,lv4 AS (SELECT 0 g FROM lv3 a CROSS JOIN lv3 b) -- 65,536
    ,lv5 AS (SELECT 0 g FROM lv4 a CROSS JOIN lv4 b) -- 4,294,967,296
    ,Tally (n) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM lv5)
SELECT CONVERT(varchar(255), NEWID()) as Sector,cast('' as varchar(1000)) as MajorSector
into #ResultSet
FROM Tally
where  n <= 242906 -- my original table record count
ORDER BY n;

Note: since this is not my original data, the timings mentioned above could be a little different. Still, the single update query is much slower than the three separate ones.

I executed the queries more than 10 times to make sure external factors did not affect the performance. All 10 times the three separate updates ran much faster than the single combined update.

Best Answer

The first of the three updates reads and writes every row of the table. The second and third then re-read and re-write a subset of those rows (look at the Actual Number of Rows). When the three statements are combined into one, the optimizer figures that if it has to read every row to satisfy the first change, it can piggy-back the second and third changes on the same pass.

Have a look at the XML version of the query plans, specifically the <ComputeScalar> operators and their <ScalarOperator ScalarString=""> attributes. In the original plan you'll see each is relatively simple and maps very closely to the SQL. In the all-in-one plan it's a monster. This is the optimizer rewriting the SQL into a logically equivalent form. A plan works¹ by passing each row through the operators once, so the optimizer does all the work needed to satisfy all three changes as each row passes through a single time.
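For illustration, a logically equivalent "one pass" form can also be written by hand, collapsing the three transformations into a single expression per row. This is a sketch only (the CROSS APPLY alias and the LIKE pattern for the letter prefixes are my own shorthand, not taken from the question or the plan XML):

```sql
-- Sketch: all three changes as one scalar expression, evaluated once per row.
-- trimmed.v holds the result of the first update's trim/double-space logic.
UPDATE r
SET MajorSector =
    CASE
        WHEN LEFT(trimmed.v, 4) IN ('(00)','(01)','(02)','(03)','(04)',
                                    '(05)','(06)','(07)','(08)','(09)')
            THEN SUBSTRING(trimmed.v, 5, LEN(trimmed.v) - 4)
        WHEN LEFT(trimmed.v, 3) LIKE '([A-Z])'   -- '(A)' .. '(Z)'
            THEN SUBSTRING(trimmed.v, 4, LEN(trimmed.v) - 3)
        ELSE trimmed.v
    END
FROM #ResultSet AS r
CROSS APPLY (SELECT CASE
                      WHEN CHARINDEX('  ', r.Sector) > 2
                          THEN RTRIM(LTRIM(SUBSTRING(r.Sector, 0, CHARINDEX('  ', r.Sector))))
                      ELSE LTRIM(RTRIM(r.Sector))
                    END) AS trimmed(v);
```

Whether written this way or via the nested-CTE form in the question, the point is the same: every row pays for every branch of the combined expression on its single pass through the plan.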

I'd expect the second query to be faster because the data is only read and written once, whereas it is touched three times in the first.

As the second query has no predicates (no WHERE clause), the optimizer has no choice but to read and process every single row. I'm surprised the second form takes longer than the first, though. Are both starting from clean buffers? Is there other work happening on the system? Since it reads and writes a temp table, the I/O happens in tempdb. Is there file growth or some such happening?
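To rule out caching effects between timing runs, the buffer pool and plan cache can be flushed first. These are standard DBCC commands, but they affect the whole instance, so use them on a test server only:

```sql
-- Test servers only: flush caches so each run starts cold.
CHECKPOINT;             -- write dirty pages to disk so clean buffers can be dropped
DBCC DROPCLEANBUFFERS;  -- empty the buffer pool (forces physical reads on next run)
DBCC FREEPROCCACHE;     -- clear cached plans (forces recompilation on next run)
```

Running each variant both cold and warm separates genuine CPU cost from I/O and compilation noise.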

By one measure you have achieved your desired outcome: you said you wanted to make changes "so that the I/O can be reduced", and the all-in-one statement does less I/O than the three separate statements do in total. I suspect what you really want, however, is reduced elapsed time, and that is clearly not happening.


¹ More or less; lots of details omitted.


I ran your routine to generate test data, then ran the three single-update statements and the all-in-one statement. Although there are some differences (no clustered index, no parallelism) I get more-or-less the same results. Specifically, the plans are about the same shape, the three individual queries complete in about two seconds, and the one big query takes about thirty to thirty-five seconds.

I set

set nocount off;
set statistics io on;
set statistics time on;

With the plan in cache and the data in memory I get:

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.
Table '#ResultSet...'. Scan count 1, logical reads 125223, physical reads 0

 SQL Server Execution Times:
   CPU time = 1422 ms,  elapsed time = 1417 ms.

(242906 row(s) affected)

Table '#ResultSet...'. Scan count 1, logical reads 125223, physical reads 0

 SQL Server Execution Times:
   CPU time = 344 ms,  elapsed time = 337 ms.

(0 row(s) affected)

Table '#ResultSet...'. Scan count 1, logical reads 125223, physical reads 0

 SQL Server Execution Times:
   CPU time = 734 ms,  elapsed time = 747 ms.

(0 row(s) affected)

I've removed some bits that aren't relevant. Since physical reads is zero for all three, the table fits in memory. Logical reads is the same for all three, which makes sense: as there are no indexes, the only approach is to scan every row of the table. The second and third queries affect zero rows because I'd already run them a few times. CPU time and elapsed time both work out to about 2,500 ms in total.

For the big combined query it is:

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.
Table '#ResultSet...'. Scan count 1, logical reads 125223

 SQL Server Execution Times:
   CPU time = 33093 ms,  elapsed time = 33137 ms.

(242906 row(s) affected)

The same number of pages are read and the same number of rows are updated. The huge difference is the CPU time. This is reflected in casual observation of Task Manager, which shows 30% utilisation for the duration of query execution. The question is, why does it take so much CPU?

The individual queries have simple calculations, and two of the three have predicates that greatly reduce the number of rows touched. The optimizer has good heuristics for processing these and finds a quick plan. The all-in-one query applies the monster Compute Scalar to every single row. My suggestion is that, for whatever reason, the optimizer cannot unravel the logic into a plan that's quick to run, and ends up burning a lot of CPU. The optimizer has to work with what it's given, which in the second case is complex, nested SQL.

Perhaps refactoring the SQL will lead the optimizer through different heuristics and achieve a better outcome. Perhaps some (filtered) indexes or (filtered) statistics will convince it to write a different plan. Maybe persisted computed columns would help it along. Or perhaps your first attempt really is the best that can be achieved, and you need to find a way to run those three statements in parallel. Sorry I can't be more scientific.
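As a sketch of the persisted computed column idea (the column and index names here are my own invention, not from the question): precompute the prefix once, at write time, so the second and third updates can filter on a plain indexed column instead of evaluating LEFT() per row.

```sql
-- Hypothetical: materialise the 4-character prefix so predicates can
-- seek on an indexed column rather than compute LEFT() on every row.
-- Note the extra cost: the persisted column is re-evaluated whenever
-- MajorSector is updated.
ALTER TABLE #ResultSet
    ADD SectorPrefix AS LEFT(MajorSector, 4) PERSISTED;

CREATE INDEX IX_ResultSet_SectorPrefix
    ON #ResultSet (SectorPrefix);
```

Whether this pays off depends on how selective the prefixes are; with only ~243k rows and a full-scan first update already required, it may well not, but it gives the optimizer another option to work with.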