Sql-server – Improve Insert performance for 200 Million rows

optimizationperformancequery-performancesql server

How can the following insert statement be improved – which varies from 20mins on certain days to 70-140mins on most other days.

SELECT DISTINCT 
    od.SourceID, 
    od.OrderID, 
    od.OrderLineNo, 
    CASE WHEN mbew.vprsv = 'S' THEN
        CASE WHEN isNumeric(mbew.stprs) = '1' THEN convert(decimal (15, 2), mbew.stprs)/peinh ELSE NULL END 
    WHEN mbew.vprsv = 'V' THEN 
        CASE WHEN isNumeric(mbew.verpr) = '1' THEN convert(decimal (15, 2), mbew.verpr)/peinh ELSE NULL END 
    ELSE NULL END as standardPrice, 
    mbew.vprsv,
    mbew.stprs,
    mbew.verpr,
    mbew.peinh,
    od.currencyCode,
    CAST(exchangerate1 AS FLOAT) * (CAST(erf.toFactor AS FLOAT) / CAST(erf.fromFactor AS FLOAT)) as exchangerate1,
    t001.waers,
    'X3' as xfactory

into TempCostTable

FROM Order_Detail od (nolock)

INNER JOIN Order_Header oh  (nolock)
    ON od.SourceID = oh.SourceID
    AND od.OrderID = oh.OrderID

INNER JOIN eord   (nolock)
    ON eord.matnr = od.ProductID
    and eord.flifn = 'X'

INNER JOIN zsm_eqmupd_cntrl zsm2    (nolock)
    ON eord.lifnr = 'IVG'+ zsm2.werks
    AND zsm2.xfactory = 'X'

INNER JOIN mbew mbew   (nolock)
    ON mbew.matnr = eord.matnr
    AND eord.lifnr = 'IVG' +  mbew.bwkey
    AND mbew.lvorm != 'X'

INNER JOIN t001k   (nolock)
    ON mbew.bwkey = t001k.bwkey

INNER JOIN t001  (nolock)
    ON t001k.bukrs = t001.bukrs

LEFT JOIN ExchangeRate_Detail exchrt (nolock)
    ON exchrt.SourceID = '000001'
    AND t001.waers = exchrt.currencycode
    AND od.currencyCode = exchrt.ToCurrencyCode
    AND CASE WHEN isdate(oh.orderDate) = 0 THEN getdate() ELSE oh.orderDate END BETWEEN exchrt.StartDate AND exchrt.EndDate 

LEFT JOIN ExchangeRate_Factor erf (nolock)
    ON erf.SourceID = '000001'
    AND t001.waers = erf.currencycode
    AND od.currencyCode = erf.ToCurrencyCode
    AND CASE WHEN isdate(oh.orderDate) IN ('0', '00000000') THEN getdate() ELSE oh.orderDate END BETWEEN erf.StartDate AND erf.EndDate

WHERE od.SourceID = '000001'

Best Answer

So many things...

If you have the Actual Execution Plan, please post it. I'm curious to see what indexes you have, and what the shape of the data is.

In the meantime, be aware that:

ON eord.lifnr = 'IVG'+ zsm2.werks

...isn't great for finding rows in zsm2. You may want to add:

AND eord.lifnr LIKE 'IVG%'
AND zsm2.werks = STUFF(eord.lifnr,1,3,'')

And you may as well put:

AND mbew.bwkey = zsm2.werks

Your types look like they need some attention, and of course your indexing strategy is bound to need some attention.

EDITED FOLLOWING PLAN UPLOAD

So, now that I've seen the plan. Yeah, it's ugly.

You should consider what is going to be a useful method for joining your main tables, and how you'd do this if you were solving it on paper.

Try indexes on both Order_Detail and Order_Header, on (SourceID, OrderID), including all the other columns that are needed. This will allow a Merge Join between those tables. Then making sure that you can quickly access the rows in each subsequent table, and you have a relatively good candidate for a nicer plan.

eg: an index on eord (flitnr, matnr, lifnr) --with the LIKE 'IVG%' thing I mentioned above.

There's going to be a bunch of pain around those join clauses with CASE. If you need it to be joining to exactly one, you may prefer to use OUTER APPLY (SELECT TOP (1) ... WHERE StartDate < CASE... oh.OrderDate...END ORDER BY StartDate DESC), with an index on (SourceID, CurrencyCode, ToCurrencyCode, StartDate DESC)... you don't really need EndDate.

I've seen worse, and at the end of the day, this should be quite tunable.