SQL Server – Apply cardinality estimation problem in SQL Server

cardinality-estimates, cross-apply, optimization, query-performance, sql-server

I am faced with a cardinality estimation problem whose logic is not quite clear to me, in a seemingly rather simple situation.
I encountered this situation at work, so for privacy reasons I will only give a general description of the problem below; for a more detailed analysis, I reproduced the problem in the AdventureWorksDW sample database.

There is a query of the following form:

SELECT <some columns>
FROM <some dates table> Dates
CROSS APPLY(

    SELECT
        <some p columns>
    FROM <some table> p
    WHERE p.StartDate <= Dates.d
      AND p.EndDate >= Dates.d
) t

[Execution plan for the first query]

As you can see from the execution plan presented above, the cardinality estimator put the estimated number of rows for the Index Seek operation at 17,884,200 (corresponding to 2,980,700 per row from the outer input of the nested loops join), which is quite close to the actual number.

Now I will modify the query by adding a LEFT OUTER JOIN inside the CROSS APPLY:

SELECT <some columns t>
FROM <some dates table> Dates
CROSS APPLY(

    SELECT
        <some p columns>
       ,<some f columns>
    FROM <some table> p
    LEFT JOIN <some table> f ON p.key = f.key
        AND f.date = Dates.d
    WHERE p.StartDate <= Dates.d
      AND p.EndDate >= Dates.d
) t

This query gives the following plan:

[Execution plan for the query with the LEFT OUTER JOIN]

Given the logical form of the query, it is natural to assume that the estimated number of rows for the Index Seek operation would remain the same. I understand that the plan search takes a different route, but the part highlighted in red has not changed and the predicates are the same; yet the Index Seek estimate is now 664,506 (corresponding to 110,751 per row from the outer input of the nested loops join). This is a serious misestimate, and in the production environment it can cause a large tempdb data spill.

The above queries were executed on an instance of SQL Server 2012 (SP4) (KB4018073) – 11.0.7001.0 (x64).

To get more details and simplify the analysis, I reproduced this problem in the AdventureWorksDW2017 database on an instance of SQL Server 2019 (RTM) – 15.0.2000.5 (x64), but I run the queries with trace flag 9481 enabled to simulate a system using cardinality estimator version 70.
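As a side note, trace flag 9481 is not the only way to get the legacy estimator. The sketch below shows two alternatives that, to the best of my knowledge, are available from SQL Server 2016 SP1 onwards (shown only for completeness; they were not used in my tests):

-- Database-wide: affects every query compiled in the current database
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = ON;

-- Per-query hint, equivalent in effect to QUERYTRACEON 9481
SELECT COUNT(*)
FROM [AdventureWorksDW2017].[dbo].[DimProduct]
OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));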

Below is the query with the LEFT OUTER JOIN.

DECLARE @db DATE = '20130720'
DECLARE @de DATE = '20130802'

;WITH Dates AS(

    SELECT [FullDateAlternateKey] AS d
    FROM [AdventureWorksDW2017].[dbo].[DimDate]
    WHERE [FullDateAlternateKey] BETWEEN @db AND @de
)
SELECT *
FROM Dates
CROSS APPLY(

    SELECT
        p.[ProductAlternateKey]
       ,f.[OrderQuantity]
    FROM [AdventureWorksDW2017].[dbo].[DimProduct] p
    LEFT JOIN [AdventureWorksDW2017].[dbo].[FactInternetSales] f ON f.ProductKey = p.ProductKey
       AND f.[OrderDate] = Dates.d
    WHERE p.StartDate <= Dates.d
      AND ISNULL(p.EndDate, '99991231') >= Dates.d

) t
OPTION(QUERYTRACEON 9481 /*force legacy CE*/)

It is also worth noting that the following index was created on the DimProduct table:

CREATE NONCLUSTERED INDEX [Date_Indx] ON [dbo].[DimProduct]
(
    [StartDate] ASC,
    [EndDate] ASC
)
INCLUDE([ProductAlternateKey])

The query gives the following query plan: (1)

[Query plan (1)]

As you can see, the part of the plan highlighted in red gives an estimate of 59,754 rows (~182 per row).
Now I will show the plan for the same query without the LEFT OUTER JOIN. (2)

[Query plan (2)]

As you can see, the part of the plan highlighted in red gives an estimate of 97,565 rows (~297 per row). The difference is not so great; however, the cardinality estimate for the Filter (3) operator differs significantly: ~244 per row versus ~54 per row in the query with the LEFT OUTER JOIN.

(3) – Filter predicate:

isnull([AdventureWorksDW2017].[dbo].[DimProduct].[EndDate] as [p].[EndDate],'9999-12-31 00:00:00.000')>=[AdventureWorksDW2017].[dbo].[DimDate].[FullDateAlternateKey]

Trying to dig deeper, I looked at the physical operator trees of the plans presented above.

Below are the most relevant parts of the output of the undocumented trace flags 8607 and 8612.
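For reference, this is roughly how I captured the trees. It is only a sketch for a test system: these trace flags are undocumented and unsupported, 3604 redirects the output to the Messages tab, 8607 prints the optimizer's output tree, and 8612 adds cardinality information to it.

DECLARE @db DATE = '20130720';
DECLARE @de DATE = '20130802';

WITH Dates AS(

    SELECT [FullDateAlternateKey] AS d
    FROM [AdventureWorksDW2017].[dbo].[DimDate]
    WHERE [FullDateAlternateKey] BETWEEN @db AND @de
)
SELECT *
FROM Dates
CROSS APPLY(

    SELECT
        p.[ProductAlternateKey]
       ,f.[OrderQuantity]
    FROM [AdventureWorksDW2017].[dbo].[DimProduct] p
    LEFT JOIN [AdventureWorksDW2017].[dbo].[FactInternetSales] f ON f.ProductKey = p.ProductKey
       AND f.[OrderDate] = Dates.d
    WHERE p.StartDate <= Dates.d
      AND ISNULL(p.EndDate, '99991231') >= Dates.d

) t
OPTION
(
    QUERYTRACEON 9481, /*force legacy CE, as above*/
    QUERYTRACEON 3604, /*send trace output to the client*/
    QUERYTRACEON 8607, /*print the optimizer output tree*/
    QUERYTRACEON 8612  /*add cardinality information to the tree*/
);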

For plan (2):

PhyOp_Apply lookup TBL: AdventureWorksDW2017.dbo.DimProduct
…
PhyOp_Range TBL: AdventureWorksDW2017.dbo.DimProduct(alias TBL: p)(6) ASC  Bmk ( QCOL: [p].ProductKey) IsRow: COL: IsBaseRow1002  [ Card=296.839 Cost(RowGoal 0,ReW 0,ReB 327.68,Dist 328.68,Total 328.68)= 0.174387 ](Distance = 2)
              ScaOp_Comp x_cmpLe
                 ScaOp_Identifier QCOL: [p].StartDate
                 ScaOp_Identifier QCOL: [AdventureWorksDW2017].[dbo].[DimDate].FullDateAlternateKey

For plan (1):

PhyOp_Apply (x_jtInner)
…
PhyOp_Range TBL: AdventureWorksDW2017.dbo.DimProduct(alias TBL: p)(6) ASC  Bmk ( QCOL: [p].ProductKey) IsRow: COL: IsBaseRow1002  [ Card=181.8 Cost(RowGoal 0,ReW 0,ReB 327.68,Dist 328.68,Total 328.68)= 0.132795 ](Distance = 2)
              ScaOp_Comp x_cmpLe
                 ScaOp_Identifier QCOL: [p].StartDate
                 ScaOp_Identifier QCOL: [AdventureWorksDW2017].[dbo].[DimDate].FullDateAlternateKey

As you can see, the optimizer chooses different implementations of the Apply operator, PhyOp_Apply lookup in (2) and PhyOp_Apply (x_jtInner) in (1), but I still do not understand what I can conclude from this.

I can get the same estimate as in plan (1) by rewriting the original query (the one without the LEFT OUTER JOIN) as follows:

DECLARE @db DATE = '20130720'
DECLARE @de DATE = '20130802'

;WITH Dates AS(

    SELECT [FullDateAlternateKey] AS d
    FROM [AdventureWorksDW2017].[dbo].[DimDate]
    WHERE [FullDateAlternateKey] BETWEEN @db AND @de
)
SELECT *
FROM Dates
CROSS APPLY(

    SELECT TOP(1000000000)
        p.[ProductAlternateKey]
    FROM [AdventureWorksDW2017].[dbo].[DimProduct] p
    WHERE p.StartDate <= Dates.d
      AND ISNULL(p.EndDate, '99991231') >= Dates.d

) t
OPTION(QUERYTRACEON 9481 /*force legacy CE*/)

Which gives the following plan: (4)

[Query plan (4)]

As you can see, the estimate for the area highlighted in red matches plan (1), and the physical operator tree contains PhyOp_Apply (x_jtInner).

Please help me answer this question: is there a way to influence this cardinality estimate, perhaps with hints or by changing the query form, and help me understand why the optimizer produces such an estimate in this case.

Best Answer

There are often several ways to derive a cardinality estimate, with each method giving a different (but equally valid) answer. That is simply the nature of statistics and estimations.

You ask essentially why one method produces an estimate of 296.839 rows, while another gives 181.8 rows.


Let's look at a simpler example of the same AdventureWorksDW2017 join as given in the question:

Example 1 - Join

DECLARE @db date = '20130720';
DECLARE @de date = '20130802';

SELECT DD.FullDateAlternateKey, DP.ProductAlternateKey
FROM dbo.DimDate AS DD
JOIN dbo.DimProduct AS DP
    ON DP.StartDate <= CONVERT(datetime, DD.FullDateAlternateKey)
WHERE
    DD.FullDateAlternateKey BETWEEN @db AND @de
OPTION (FORCE ORDER, USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));

This is a join between:

  • DimDate (filtered on FullDateAlternateKey BETWEEN @db AND @de); and
  • DimProduct

with the join predicate being:

  • DP.StartDate <= CONVERT(datetime, DD.FullDateAlternateKey)

One way to compute the selectivity of the join is to consider how FullDateAlternateKey values will overlap with StartDate values using histogram information.

The histogram steps of FullDateAlternateKey will be scaled for the selectivity of BETWEEN @db AND @de, before being compared with DP.StartDate to see how they join.

Using the original CE, the join estimation will align the two histograms step by step using linear interpolation before being 'joined'.

Once we have computed the selectivity of the join using this method, it doesn't matter (except for display purposes) whether the join is a hash, merge, nested loops, or apply.

The steps of the histogram-based calculation aren't particularly difficult, but they are too long-winded to show here. So I will cut to the chase and simply show the outcome:

[Execution plan for Example 1 - Join]

Notice the estimate of 296.839 rows on the DimProduct seek.

This is a consequence of the join cardinality estimate being computed as 97,565.2 rows (using histograms). The filter on DimDate passes through 328.68 rows, so the inner side must produce 296.839 rows per iteration on average to make the maths work out.
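As a quick back-of-the-envelope check of that division (plain arithmetic only, not something the optimizer executes):

-- join estimate / outer-side rows = inner-side rows per iteration
SELECT 97565.2 / 328.68 AS EstRowsPerIteration;   -- ≈ 296.839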

If a hash or merge join were possible for this query (which it isn't, due to the inequality), the DimProduct table would be scanned, producing all of its 606 rows. The result of the join would still be 97,565.2 rows.

This estimate is a consequence of estimating as a join.
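If you want to look at the histograms that feed this calculation yourself, something like the following can be used. The statistics names here are assumptions on my part: the StartDate histogram comes from the Date_Indx index created in the question, while the name of the statistics object on FullDateAlternateKey may differ on your system, so check sys.stats first.

-- List the statistics objects on the two tables to find the right names
SELECT OBJECT_NAME(s.[object_id]) AS table_name, s.name AS stats_name
FROM sys.stats AS s
WHERE s.[object_id] IN (OBJECT_ID('dbo.DimDate'), OBJECT_ID('dbo.DimProduct'));

-- Histogram on DimDate.FullDateAlternateKey (statistics name assumed)
DBCC SHOW_STATISTICS ('dbo.DimDate', AK_DimDate_FullDateAlternateKey) WITH HISTOGRAM;

-- Histogram on DimProduct.StartDate (from the Date_Indx index in the question)
DBCC SHOW_STATISTICS ('dbo.DimProduct', Date_Indx) WITH HISTOGRAM;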

Example 2 - Apply

We could also estimate this query as an apply. A logically-equivalent form written in T-SQL is:

DECLARE @db date = '20130720';
DECLARE @de date = '20130802';

SELECT DD.FullDateAlternateKey, DP.ProductAlternateKey
FROM dbo.DimDate AS DD
CROSS APPLY
(
    SELECT DP.ProductAlternateKey
    FROM dbo.DimProduct AS DP
    WHERE
        DP.StartDate <= CONVERT(datetime, DD.FullDateAlternateKey)
) AS DP
WHERE
    DD.FullDateAlternateKey BETWEEN @db AND @de
OPTION (FORCE ORDER, USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'), QUERYTRACEON 9114);

(trace flag 9114 prevents the optimizer rewriting the apply as a join)

The estimation approach this time is to assess how many rows will match in DimProduct for each row from DimDate (per iteration):

[Execution plan for Example 2 - Apply]

We have 328.68 rows from DimDate as before, but now each of those rows is expected to match 181.8 rows in DimProduct.

This is simply a guess at the selectivity of StartDate <= FullDateAlternateKey.

The guess is 30% of the 606 rows in DimProduct: 0.3 * 606 = 181.8 rows.
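To make the arithmetic explicit (the 30% figure is the original CE's fixed selectivity guess for this kind of inequality predicate, and 606 is the DimProduct row count shown in the plan above):

-- DimProduct row count (606 in AdventureWorksDW2017)
SELECT COUNT(*) AS DimProductRows
FROM dbo.DimProduct;

-- Fixed 30% guess applied to that row count
SELECT 0.30 * 606 AS EstRowsPerIteration;   -- 181.8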

This estimate is a consequence of estimating as an apply.

Final notes

Your example introduces an outer join as a way to make the query too complex for the optimizer to transform from apply to join form. Using TOP inside the apply is another way to convince the optimizer not to translate an apply to join (even when it could).
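For example, here is a sketch of Example 2 rewritten with a TOP instead of trace flag 9114. Under the same assumptions it keeps the apply shape, and so should reproduce the per-iteration guess of 181.8 rows (the TOP value is arbitrary, just large enough never to limit the result):

DECLARE @db date = '20130720';
DECLARE @de date = '20130802';

SELECT DD.FullDateAlternateKey, DP.ProductAlternateKey
FROM dbo.DimDate AS DD
CROSS APPLY
(
    SELECT TOP (1000000000)
        DP.ProductAlternateKey
    FROM dbo.DimProduct AS DP
    WHERE
        DP.StartDate <= CONVERT(datetime, DD.FullDateAlternateKey)
) AS DP
WHERE
    DD.FullDateAlternateKey BETWEEN @db AND @de
OPTION (FORCE ORDER, USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));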