Sql-server – get table and field names from union in execution plan

execution-plansql serverunion

I'm trying to find performance issues on query which is based on other queries etc.

I found that I have a union query which is slow, but I have no idea which union query/column it is.

All I get in the execution plan are nodes like:

 <SeekPredicates>
                                              <SeekPredicateNew>
                                                <SeekKeys>
                                                  <Prefix ScanType="EQ">
                                                    <RangeColumns>
                                                      <ColumnReference Database="[mydb]" Schema="[dbo]" Table="[Sales]" Column="SaleID" />
                                                    </RangeColumns>
                                                    <RangeExpressions>
                                                      <ScalarOperator ScalarString="[Union1399]">
                                                        <Identifier>
                                                          <ColumnReference Column="Union1399" />
                                                        </Identifier>
                                                      </ScalarOperator>
                                                    </RangeExpressions>
                                                  </Prefix>
                                                </SeekKeys>
                                              </SeekPredicateNew>
                                            </SeekPredicates>

How can I know which fields/tables make up this "union1399"? So can index it or remove it or otherwise optimize the query.

Thanks!

Best Answer

The answer lies in the system table/views.

Always refer to the official documentation from MSDN first, and search Dave Pinal second, as you will likely find these types of questions answered. ~ Helpful Hint #3 ;)

The system table you want is through sys.dm_exec_query_plan

This function stores the execution plans of the caches plans

And you can acquire the needed sql_plan varbinary value from one of three places:

sys.dm_exec_cached_plans
sys.dm_exec_query_stats
sys.dm_exec_requests

An example is like the following:

SELECT qt.text, qp.query_plan
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
WHERE qt.text LIKE 'SELECT %'

I left the predicate in this example, but you can find different methods to achieve the same goal.

Anyways, the XML column query_plan will have all of the tables, columns, and even the values of any variables available in that plan. If you click on it, you get shown the visual query plan, but copy and pasting the field will have the actual XML plan used in the query.

Cheers.

Related Solutions

Sql-server – Execution plan shows expensive CONVERT_IMPLICIT operation. Can I fix this with indexing or do I need to change the table

You shouldn't rely too much on cost percentages in execution plans. These are always estimated costs, even in post-execution plans with 'actual' numbers for things like row counts. The estimated costs are based on a model that happens to work pretty well for the purpose it is intended for: enabling the optimizer to choose between different candidate execution plans for the same query. The cost information is interesting, and a factor to consider, but it should rarely be a primary metric for query tuning. Interpreting execution plan information requires a broader view of the presented data.

ItemTran Clustered Index Seek Operator

ItemTran Clustered Index Seek

This operator is really two operations in one. First an index seek operation finds all rows that match the predicate product_code_v42 = 'M10BOLT', then each row has the residual predicate bit_field_v41 & 4 = 0 applied. There is an implicit conversion of bit_field_v41 from its base type (tinyint or smallint) to integer.

The conversion occurs because the bitwise-AND operator (&) requires both operands to be of the same type. The implicit type of the constant value '4' is integer and the data type precedence rules mean the lower-priority bit_field_v41 field value is converted.

The problem (such as it is) is easily corrected by writing the predicate as bit_field_v41 & CONVERT(tinyint, 4) = 0 - meaning the constant value has the lower priority and is converted (during constant folding) rather than the column value. If the bit_field_v41 is tinyint no conversions occur at all. Likewise, CONVERT(smallint, 4) could be used if bit_field_v41 is smallint. That said, the conversion is not a performance issue in this case, but it is still good practice to match types and avoid implicit conversions where possible.

The major part of the estimated cost of this seek is down to the size of the base table. While the clustered index key is itself reasonably narrow, the size of each row is large. A definition for the table is not given, but just the columns used in the view add up to a significant row width. Since the clustered index includes all columns, the distance between clustered index keys is the width of the row, not the width of the index keys. The use of version suffixes on some columns suggests the real table has even more columns for previous versions.

Looking at the seek, residual predicate and output columns, the performance of this operator could be checked in isolation by building the equivalent query (the 1 <> 2 is a trick to prevent auto-parameterization, the contradiction is removed by the optimizer and does not appear in the query plan):

SELECT
    it.booking_no_v32,
    it.QtyCheckedOut,
    it.QtyReturned,
    it.Trans_qty,
    it.trans_type_v41
FROM dbo.tblItemTran AS it
WHERE
    1 <> 2
    AND it.product_code_v42 = 'M10BOLT'
    AND it.bit_field_v41 & CONVERT(tinyint, 4) = 0;

The performance of this query with a cold data cache is of interest, since read-ahead would be affected by table (clustered index) fragmentation. The clustering key for this table invites fragmentation, so it could be important to maintain (reorganize or rebuild) this index regularly, and use an appropriate FILLFACTOR to allow space for new rows between index maintenance windows.

I performed a test of the effect of fragmentation on read-ahead using sample data generated using SQL Data Generator. Using the same table row counts as shown in the question's query plan, a highly fragmented clustered index resulted in SELECT * FROM view taking 15 seconds after DBCC DROPCLEANBUFFERS. The same test in the same conditions with a freshly-rebuilt clustered index on the ItemTrans table completed in 3 seconds.

If the table data is typically entirely in cache the fragmentation issue is very much less important. But, even with low fragmentation, the wide table rows might mean the number of logical and physical reads is much higher than might be expected. You could also experiment with adding and removing the explicit CONVERT to validate my expectation that the implicit conversion issue is not important here, except as a best practice violation.

More to the point is the estimated number of rows leaving the seek operator. The optimization-time estimate is 165 rows, but 4,226 were produced at execution time. I will return to this point later, but the main reason for the discrepancy is that the selectivity of the residual predicate (involving the bitwise-AND) is very hard for the optimizer to predict - in fact it resorts to guessing.

Filter Operator

Filter Operator

I am showing the filter predicate here mostly to illustrate how the two NOT IN lists are combined, simplified and then expanded, and also to provide a reference for the following hash match discussion. The test query from the seek can be expanded to incorporate its effects and determine the effect of the Filter operator on performance:

SELECT
    it.booking_no_v32,
    it.trans_type_v41,
    it.Trans_qty,
    it.QtyReturned,
    it.QtyCheckedOut
FROM dbo.tblItemTran AS it
WHERE
    it.product_code_v42 = 'M10BOLT'
    AND it.bit_field_v41 & CONVERT(tinyint, 4) = 0
    AND
    (
        (
            it.trans_type_v41 NOT IN (2, 3, 6, 7, 18, 19, 20, 21, 12, 13, 22)
            AND it.trans_type_v41 NOT IN (6, 7)
        )
        OR
        (
            it.trans_type_v41 NOT IN (6, 7)
        )
        OR 
        (
            it.trans_type_v41 IN (6, 7)
            AND it.QtyCheckedOut = 0
        )
        OR 
        (
            it.trans_type_v41 IN (6, 7)
            AND it.QtyCheckedOut > 0
            AND it.trans_qty - (it.QtyCheckedOut - it.QtyReturned) > 0
        )
    );

The Compute Scalar operator in the plan defines the following expression (the calculation itself is deferred until the result is required by a later operator):

[Expr1016] = (trans_qty - (QtyCheckedOut - QtyReturned))

The Hash Match Operator

Performing a join on character data types is not the reason for the high estimated cost of this operator. The SSMS tooltip shows only a Hash Keys Probe entry, but the important details are in the SSMS Properties window.

The Hash Match operator builds a hash table using the values of the booking_no_v32 column (Hash Keys Build) from the ItemTran table, and then probes for matches using the booking_no column (Hash Keys Probe) from the Bookings table. The SSMS tooltip would also normally show a Probe Residual, but the text is far too long for a tooltip and is simply omitted.

A Probe Residual is similar to the Residual seen after the index seek earlier; the residual predicate is evaluated on all rows that hash match to determine if the row should be passed to the parent operator. Finding hash matches in a well-balanced hash table is extremely fast, but applying a complex residual predicate to each row that matches is quite slow by comparison. The Hash Match tooltip in Plan Explorer shows the details, including the Probe Residual expression:

Hash Match Operator

The residual predicate is complex and includes the booking progress status check now that column is available from the bookings table. The tooltip also shows the same discrepancy between estimated and actual rows counts seen earlier in the index seek. It may seem odd that much of the filtering is performed twice, but this is just the optimizer being optimistic. It does not expect the parts of the filter which can be pushed down the plan from the probe residual to eliminate any rows (the row count estimates are the same before and after the filter) but the optimizer knows it might be wrong about that. The chance of filtering rows early (reducing the cost of the hash join) is worth the small cost of the extra filter. The whole filter cannot be pushed down because it includes a test on a column from the bookings table, but most of it can be. The full test is still required in the hash residual test for correctness.

The row count underestimate is an issue for the Hash Match operator because the amount of memory reserved for the hash table is based on the estimated number of rows. Where the memory is too small for the size of hash table required at run time (due to the larger number of rows) the hash table recursively spills to physical tempdb storage, often resulting in very poor performance. In the worst case, the execution engine stops recursively spilling hash buckets and resorts to a very slow bailout algorithm. Hash spilling (recursive or bailout) is the most likely cause of the performance problems outlined in the question (not character-type join columns or implicit conversions). The root cause would be the server reserving too little memory for the query based on incorrect row count (cardinality) estimation.

Sadly, before SQL Server 2012, there is no indication in the execution plan that a hashing operation exceeded its memory allocation (which cannot dynamically grow after being reserved before execution starts, even if the server has masses of free memory) and had to spill to tempdb. It is possible to monitor the Hash Warning Event Class using Profiler, but it can be difficult to correlate the warnings with a particular query.

Correcting the problems

The three issues are fragmentation, the complex probe residual in the hash match operator and the incorrect cardinality estimate resulting from the guessing at the index seek.

Recommended solution

Check the fragmentation and correct it if necessary, scheduling maintenance to ensure the index stays acceptably organized. The usual way to correct the cardinality estimate is to provide statistics. In this case, the optimizer needs statistics for the combination (product_code_v42, bitfield_v41 & 4 = 0). We cannot create statistics on an expression directly, so we must first create a computed column for the bit field expression, and then create the manual multi-column statistics:

ALTER TABLE dbo.tblItemTran
ADD Bit3 AS bit_field_v41 & CONVERT(tinyint, 4);

CREATE STATISTICS [stats dbo.ItemTran (product_code_v42, Bit3)]
ON dbo.tblItemTran (product_code_v42, Bit3);

The computed column text definition must match the text in the view definition pretty much exactly for the statistics to be used, so correcting the view to eliminate the implicit conversion should be done at the same time, and care taken to ensure a textual match.

The multi-column statistics ought to result in much better estimates, greatly reducing the chance that the hash match operator will use recursive spilling or the bailout algorithm. Adding the computed column (which is a metadata-only operation, and takes no space in the table since it is not marked PERSISTED) and the multi-column statistics is my best guess at a first solution.

When solving query performance problems, it is important to measure things like elapsed time, CPU usage, logical reads, physical reads, wait types and durations...and so on. It can also be useful to run parts of the query separately to validate suspected causes as shown above.

In some environments, where an up-to-the-second view of the data is not important, it can be useful to run a background process that materializes the whole view into a snapshot table every so often. This table is just a normal base table and can be indexed for read queries without worrying about impacting update performance.

View indexing

Do not be tempted to index the original view directly. Read performance will be amazingly fast (a single seek on a view index) but (in this case) all the performance problems in the existing query plans will be transferred to queries that modify any of the table columns referenced in the view. Queries that change base table rows will be impacted very badly indeed.

Advanced solution with a partial indexed view

There is a partial indexed-view solution for this particular query that corrects cardinality estimates and removes the filter and probe residual, but it is based on some assumptions about the data (mostly my guess at the schema) and requires expert implementation, particularly regarding suitable indexes to support the indexed view maintenance plans. I share the code below for interest, I do not propose you implement it without very careful analysis and testing.

-- Indexed view to optimize the main view
CREATE VIEW dbo.V1
WITH SCHEMABINDING
AS
SELECT
    it.ID,
    it.product_code_v42,
    it.trans_type_v41,
    it.booking_no_v32,
    it.Trans_qty,
    it.QtyReturned,
    it.QtyCheckedOut,
    it.QtyReserved,
    it.bit_field_v41,
    it.prep_on,
    it.From_locn,
    it.Trans_to_locn,
    it.PDate,
    it.FirstDate,
    it.PTimeH,
    it.PTimeM,
    it.RetnDate,
    it.BookDate,
    it.TimeBookedH,
    it.TimeBookedM,
    it.TimeBookedS,
    it.del_time_hour,
    it.del_time_min,
    it.return_to_locn,
    it.return_time_hour,
    it.return_time_min,
    it.AssignTo,
    it.AssignType,
    it.InRack
FROM dbo.tblItemTran AS it
JOIN dbo.tblBookings AS tb ON
    tb.booking_no = it.booking_no_v32
WHERE
    (
        it.trans_type_v41 NOT IN (2, 3, 7, 18, 19, 20, 21, 12, 13, 22)
        AND it.trans_type_v41 NOT IN (6, 7)
        AND it.bit_field_v41 & CONVERT(tinyint, 4) = 0
    )
    OR
    (
        it.trans_type_v41 NOT IN (6, 7)
        AND it.bit_field_v41 & CONVERT(tinyint, 4) = 0
        AND tb.BookingProgressStatus = 1
    )
    OR 
    (
        it.trans_type_v41 IN (6, 7)
        AND it.bit_field_v41 & CONVERT(tinyint, 4) = 0
        AND it.QtyCheckedOut = 0
    )
    OR 
    (
        it.trans_type_v41 IN (6, 7)
        AND it.bit_field_v41 & CONVERT(tinyint, 4) = 0
        AND it.QtyCheckedOut > 0
        AND it.trans_qty - (it.QtyCheckedOut - it.QtyReturned) > 0
    );
GO
CREATE UNIQUE CLUSTERED INDEX cuq ON dbo.V1 (product_code_v42, ID);
GO

The existing view tweaked to use the indexed view above:

CREATE VIEW [dbo].[vwReallySlowView2]
AS
SELECT
    I.booking_no_v32 AS bkno,
    I.trans_type_v41 AS trantype,
    B.Assigned_to_v61 AS Assignbk,
    B.order_date AS dateo,
    B.HourBooked AS HBooked,
    B.MinBooked AS MBooked,
    B.SecBooked AS SBooked,
    I.prep_on AS Pon,
    I.From_locn AS Flocn,
    I.Trans_to_locn AS TTlocn,
    CASE I.prep_on 
        WHEN 'Y' THEN I.PDate
        ELSE I.FirstDate
    END AS PrDate,
    I.PTimeH AS PrTimeH,
    I.PTimeM AS PrTimeM,
    CASE
        WHEN I.RetnDate < I.FirstDate 
        THEN I.FirstDate 
        ELSE I.RetnDate
    END AS RDatev,
    I.bit_field_v41 AS bitField,
    I.FirstDate AS FDatev,
    I.BookDate AS DBooked,
    I.TimeBookedH AS TBookH,
    I.TimeBookedM AS TBookM,
    I.TimeBookedS AS TBookS,
    I.del_time_hour AS dth,
    I.del_time_min AS dtm,
    I.return_to_locn AS rtlocn,
    I.return_time_hour AS rth,
    I.return_time_min AS rtm,
    CASE
        WHEN
            I.Trans_type_v41 IN (6, 7) 
            AND I.Trans_qty < I.QtyCheckedOut
            THEN 0 
        WHEN
            I.Trans_type_v41 IN (6, 7)
            AND I.Trans_qty >= I.QtyCheckedOut
            THEN I.Trans_Qty - I.QtyCheckedOut
        ELSE
            I.trans_qty
    END AS trqty,
    CASE
        WHEN I.Trans_type_v41 IN (6, 7)
        THEN 0
        ELSE I.QtyCheckedOut
    END AS MyQtycheckedout,
    CASE
        WHEN I.Trans_type_v41 IN (6, 7)
        THEN 0
        ELSE I.QtyReturned
    END AS retqty,
    I.ID,
    B.BookingProgressStatus AS bkProg,
    I.product_code_v42,
    I.return_to_locn,
    I.AssignTo,
    I.AssignType,
    I.QtyReserved,
    B.DeprepOn,
    CASE B.DeprepOn
        WHEN 1 THEN B.DeprepDateTime
        ELSE I.RetnDate
    END AS DeprepDateTime,
    I.InRack
FROM dbo.V1 AS I WITH (NOEXPAND)
JOIN dbo.tblbookings AS B ON
    B.booking_no = I.booking_no_v32
JOIN dbo.tblInvmas AS M ON
    I.product_code_v42 = M.product_code;

Example query and execution plan:

SELECT
    vrsv.*
FROM dbo.vwReallySlowView2 AS vrsv
WHERE vrsv.product_code_v42 = 'M10BOLT';

New execution plan

In the new plan, the hash match has no residual predicate, there is no complex filter, no residual predicate on the indexed view seek, and the cardinality estimates are exactly correct.

As an example of how insert/update/delete plans would be affected, this is the plan for an insert to the ItemTrans table:

Insert plan

The highlighted section is new and required for indexed view maintenance. The table spool replays inserted base table rows for indexed view maintenance. Each row is joined to the bookings table using a clustered index seek, then a filter applies the complex WHERE clause predicates to see if the row needs to be added to the view. If so, an insert is performed to the view's clustered index.

The same SELECT * FROM view test performed earlier completed in 150ms with the indexed view in place.

Final thing: I notice your 2008 R2 server is still at RTM. It won't fix your performance problems, but Service Pack 2 for 2008 R2 has been available since July 2012, and there are many good reasons to keep as current as possible with service packs.

Sql-server – Index lost while joining on union

Try pushing the predicates as close to the base tables as possible:

SELECT * 
FROM  (SELECT items.*, Row_number() OVER (partition BY itemid 
                                          ORDER BY origdate DESC
                                         )AS RowNum 
      FROM   items 
      WHERE  origdate < Getdate() 
        AND orighand = 930
      ) t1

Do the same for all legs of the union. If the predicates contain references to an an outer table as in:

SELECT ...
FROM T1
JOIN (
    SELECT ...
    FROM T2
    WHERE T2.x = 'WHATEVER'
) T3
    ON T3.y = T1.Y

You can use a LATERAL JOIN (CROSS APPLY in SQL-server) to be able to push this predicate inside the derived table:

SELECT ...
FROM T1
CROSS JOIN LATERAL (
    SELECT ...
    FROM T2
    WHERE T2.x = 'WHATEVER'
      AND T2.y = T1.Y
) T3

Best Answer

Related Solutions

Sql-server – Execution plan shows expensive CONVERT_IMPLICIT operation. Can I fix this with indexing or do I need to change the table

Sql-server – Index lost while joining on union

Related Question