Sql-server – Join To a @Table Variable is running ineficiently

sql serversql-server-2008stored-proceduresview

Alright so I have a report stored procedure that was running incredibly slow. Customers were complaining that the report would not run so I started investigating exactly where in the stored procedure the problem was and I found this portion taking up 99.8% of the time.

DECLARE @xmlTemp TABLE (
  CompanyID  INT,
  StoreID    INT,
  StartDate  DATETIME,
  DateStaID  INT,
  EndDate    DATETIME,
  DateEndID  INT,
  LastUpdate DATETIME)

INSERT INTO @xmlTemp
VALUES      (50,
             2,
             '3/3/2013',
             0,
             '3/3/2013',
             0,
             '3/3/2013')

SELECT DISTINCT T.CompanyID,
                CompanyName,
                Z.StoreID,
                StoreName,
                CashedOutBy,
                TransactionID,
                RegisterID,
                BusinessDate,
                CashedOut,
                Total - Isnull((SELECT Sum(DISTINCT PaymentAmount)
                                FROM   vPullDrawerPayments
                                WHERE  CompanyID = T.CompanyID
                                       AND StoreID = T.StoreID
                                       AND TransactionID = T.TransactionID
                                       AND Isnull(PaymentType, 1) <> 1), 0) AS PaymentAmount,
                'Cash'                                                      AS PaymentDesc,
                CASE
                  WHEN Z.EndDate >= Z.LastUpdate THEN 1
                  ELSE 0
                END                                                         AS MissingData
FROM   vPullDrawerPayments AS T
       INNER JOIN @xmlTemp AS Z
         ON T.CompanyID = Z.CompanyID
            AND T.StoreID = Z.StoreID
WHERE  BusinessDate BETWEEN Z.StartDate AND Z.EndDate
UNION ALL
SELECT DISTINCT NC.CompanyID,
                CompanyName,
                Z.StoreID,
                StoreName,
                CashedOutBy,
                TransactionID,
                RegisterID,
                BusinessDate,
                CashedOut,
                PaymentAmount,
                PaymentDesc,
                CASE
                  WHEN Z.EndDate >= Z.LastUpdate THEN 1
                  ELSE 0
                END AS MissingData
FROM   vPullDrawerPayments AS NC
       INNER JOIN @xmlTemp AS Z
         ON NC.CompanyID = Z.CompanyID
            AND NC.StoreID = Z.StoreID
WHERE  BusinessDate BETWEEN Z.StartDate AND Z.EndDate
       AND Isnull(PaymentType, 1) <> 1
UNION ALL
SELECT DISTINCT C.CompanyID,
                CompanyName,
                Z.StoreID,
                StoreName,
                CashedOutBy,
                TransactionID,
                RegisterID,
                BusinessDate,
                CashedOut,
                Sum(Abs(LineAmount)) AS PaymentAmount,
                'Coupons'            AS PaymentDesc,
                CASE
                  WHEN Max(Z.EndDate) >= Max(Z.LastUpdate) THEN 1
                  ELSE 0
                END                  AS MissingData
FROM   vPullDrawerPayments AS C
       INNER JOIN @xmlTemp AS Z
         ON C.CompanyID = Z.CompanyID
            AND C.StoreID = Z.StoreID
WHERE  BusinessDate BETWEEN Z.StartDate AND Z.EndDate
GROUP  BY C.CompanyID,
          CompanyName,
          Z.StoreID,
          StoreName,
          CashedOutBy,
          TransactionID,
          RegisterID,
          BusinessDate,
          CashedOut 

The @xmlTemp portion of this query is normally used to take parameters from our web application and turn them into parameters the report can actually use. For the sake of testing this I am just inserting values that run this for one store for one day. Running this portion can take upwards of 20 minutes.

So I ran this query plan through PlanExplorer and saw it was pulling all of the data from two of my fact tables instead of filtering out just that store and that day. As seen in the picture below.

QueryPlan
Obviously this is bad. So the next step I took is to cut out the join the @xml temp and just manually put in the values in the queries WHERE clause to see how well that worked.

SELECT DISTINCT T.CompanyID,
                CompanyName,
                T.StoreID,
                StoreName,
                CashedOutBy,
                TransactionID,
                RegisterID,
                BusinessDate,
                CashedOut,
                Total - Isnull((SELECT Sum(DISTINCT PaymentAmount)
                                FROM   vPullDrawerPayments
                                WHERE  CompanyID = T.CompanyID
                                       AND StoreID = T.StoreID
                                       AND TransactionID = T.TransactionID
                                       AND Isnull(PaymentType, 1) <> 1), 0) AS PaymentAmount,
                'Cash'                                                      AS PaymentDesc
--CASE WHEN Z.'3/3/2013' >= Z.LastUpdate THEN 1 ELSE 0 END AS MissingData
FROM   vPullDrawerPayments AS T
WHERE  CompanyID = 50
       AND StoreID = 1
       AND BusinessDate BETWEEN '3/3/2013' AND '3/3/2013'
UNION ALL
SELECT DISTINCT NC.CompanyID,
                CompanyName,
                NC.StoreID,
                StoreName,
                CashedOutBy,
                TransactionID,
                RegisterID,
                BusinessDate,
                CashedOut,
                PaymentAmount,
                PaymentDesc
--CASE WHEN Z.'3/3/2013' >= Z.LastUpdate THEN 1 ELSE 0 END AS MissingData
FROM   vPullDrawerPayments AS NC
WHERE  CompanyID = 50
       AND StoreID = 1
       AND BusinessDate BETWEEN '3/3/2013' AND '3/3/2013'
       AND Isnull(PaymentType, 1) <> 1
UNION ALL
SELECT DISTINCT C.CompanyID,
                CompanyName,
                C.StoreID,
                StoreName,
                CashedOutBy,
                TransactionID,
                RegisterID,
                BusinessDate,
                CashedOut,
                Sum(Abs(LineAmount)) AS PaymentAmount,
                'Coupons'            AS PaymentDesc
--CASE WHEN MAX(Z.'3/3/2013') >= MAX(Z.LastUpdate) THEN 1 ELSE 0 END AS MissingData
FROM   vPullDrawerPayments AS C

WHERE  CompanyID = 50
       AND StoreID = 1
       AND BusinessDate BETWEEN '3/3/2013' AND '3/3/2013'
GROUP  BY C.CompanyID,
          CompanyName,
          C.StoreID,
          StoreName,
          CashedOutBy,
          TransactionID,
          RegisterID,
          BusinessDate,
          CashedOut 

By changing this to a simple where clause it runs in 4 seconds instead of 20 minutes and up. Also the query Plan is showing the correct Is there any reason I should be seeing this behavior?

Edit here is the full Link to the QueryPlan.

Best Answer

[Copying from my answer on SQLPerformance.com.]

Some very brief initial suggestions from discussions elsewhere:

  1. Try creating @xmlTemp as a #temp table with a clustered index on (StartDate, EndDate) instead of a table variable. This may provide SQL Server with more accurate stats information (though questionably useful if the table only has one row).
  2. If @xmlTemp has only one row always, use two variables instead of a table in the first place.
  3. Try adding the (RECOMPILE) option to the statement, especially if you convert to variables instead of the #temp table (parameter sniffing).
  4. Try using OPTION (MAXDOP 1) - parallelism is definitely in use, and at the lower end the threads seem partially imbalanced. I wonder if parallelism is helping or hurting here - can't hurt to test duration with and without.
  5. You may need to perform more rigorous stats updates. A lot of these estimates are way, way off.
  6. Remove the DISTINCTs. For this set of columns I find it hard to believe this is eliminating any duplicates, but the optimizer has to work as if there are dupes to remove.
  7. Consider using Table-Valued Parameters (TVPs) instead of shredding XML for the different companies / stores.