Sql-server – Why does changing the declared join column order introduce a sort

Tags: join, sort-operator, sql-server, sql-server-2014, sql-server-2017

I have two tables with identically named, typed, and indexed key columns. One of them has a unique clustered index; the other has a non-unique clustered index.

The test setup

Setup script, including some realistic statistics:

DROP TABLE IF EXISTS #left;
DROP TABLE IF EXISTS #right;

CREATE TABLE #left (
    a       char(4) NOT NULL,
    b       char(2) NOT NULL,
    c       varchar(13) NOT NULL,
    d       bit NOT NULL,
    e       char(4) NOT NULL,
    f       char(25) NULL,
    g       char(25) NOT NULL,
    h       char(25) NULL
    --- and a few other columns
);

CREATE UNIQUE CLUSTERED INDEX IX ON #left (a, b, c, d, e, f, g, h);

UPDATE STATISTICS #left WITH ROWCOUNT=63800000, PAGECOUNT=186000;

CREATE TABLE #right (
    a       char(4) NOT NULL,
    b       char(2) NOT NULL,
    c       varchar(13) NOT NULL,
    d       bit NOT NULL,
    e       char(4) NOT NULL,
    f       char(25) NULL,
    g       char(25) NOT NULL,
    h       char(25) NULL
    --- and a few other columns
);

CREATE CLUSTERED INDEX IX ON #right (a, b, c, d, e, f, g, h);

UPDATE STATISTICS #right WITH ROWCOUNT=55700000, PAGECOUNT=128000;

The repro

When I join these two tables on their clustering keys, I expect a one-to-many MERGE join, like so:

SELECT *
FROM #left AS l
LEFT JOIN #right AS r ON
    l.a=r.a AND
    l.b=r.b AND
    l.c=r.c AND
    l.d=r.d AND
    l.e=r.e AND
    l.f=r.f AND
    l.g=r.g AND
    l.h=r.h
WHERE l.a='2018';

This is the query plan I want:

(Never mind the warnings, they have to do with the fake statistics.)

However, if I change the order of the columns around in the join, like so:

SELECT *
FROM #left AS l
LEFT JOIN #right AS r ON
    l.c=r.c AND     -- used to be third
    l.a=r.a AND     -- used to be first
    l.b=r.b AND     -- used to be second
    l.d=r.d AND
    l.e=r.e AND
    l.f=r.f AND
    l.g=r.g AND
    l.h=r.h
WHERE l.a='2018';

… this happens:

The Sort operator seems to order the streams according to the declared order of the join, i.e. c, a, b, d, e, f, g, h, which adds a blocking operation to my query plan.

Things I've looked at

I've tried changing the columns to NOT NULL, same results.
The original table was created with ANSI_PADDING OFF, but creating it with ANSI_PADDING ON does not affect this plan.
I tried an INNER JOIN instead of LEFT JOIN, no change.
I discovered it on a 2014 SP2 Enterprise, created a repro on a 2017 Developer (current CU).
Removing the WHERE clause on the leading index column does generate the good plan, but it kind of affects the results. 🙂

Finally, we get to the question

Is this intentional?
Can I eliminate the sort without changing the query (which is vendor code, so I'd really rather not…)? I can change the table and indexes.

Best Answer

Is this intentional?

It is by design, yes. The best public source for this assertion was unfortunately lost when Microsoft retired the Connect feedback site, obliterating many useful comments from developers on the SQL Server team.

Anyway, the current optimizer design does not actively seek to avoid unnecessary sorts per se. This is most often encountered with windowing functions and the like, but can also be seen with other operators that are sensitive to ordering, and in particular to preserved ordering between operators.

Nevertheless, the optimizer is quite good (in many cases) at avoiding unnecessary sorting, but this outcome normally occurs for reasons other than aggressively trying different ordering combinations. In that sense, it is not so much a question of 'search space' as it is of the complex interactions between orthogonal optimizer features that have been shown to increase general plan quality at acceptable cost.

For example, sorting can often be avoided simply by matching an ordering requirement (e.g. top-level ORDER BY) to an existing index. Trivially in your case that could mean adding ORDER BY l.a, l.b, l.c, l.d, l.e, l.f, l.g, l.h; but this is an over-simplification (and unacceptable because you do not want to change the query).
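As a rough sketch of that ORDER BY idea (illustrative only, since it changes the query text), the reordered join from the question could be written with a top-level ORDER BY that matches the clustered index key order. Whether the sort actually disappears depends on the plan the optimizer ends up choosing, so check the execution plan rather than assuming it.

```sql
-- Same join as the repro, with the declared order c, a, b, ...
-- The only change is the top-level ORDER BY, which matches the
-- clustered index key order on #left.
SELECT *
FROM #left AS l
LEFT JOIN #right AS r ON
    l.c=r.c AND
    l.a=r.a AND
    l.b=r.b AND
    l.d=r.d AND
    l.e=r.e AND
    l.f=r.f AND
    l.g=r.g AND
    l.h=r.h
WHERE l.a='2018'
ORDER BY l.a, l.b, l.c, l.d, l.e, l.f, l.g, l.h;
```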

More generally, each memo group may be associated with required or desired properties, which may include input ordering. When there is no obvious reason to enforce a particular order (e.g. to satisfy an ORDER BY, or to ensure correct results from an order-sensitive physical operator), there is an element of 'luck' involved. I wrote more about the specifics of that as it pertains to merge join (in union or join mode) in Avoiding Sorts with Merge Join Concatenation. Much of that goes beyond the supported surface area of the product, so treat it as informational, and subject to change.

In your particular case, yes, you may adjust the indexing as jadarnel27 suggests to avoid the sorts; though there is little reason to actually prefer a merge join here. You could also hint a choice between hash or loop physical join with OPTION(HASH JOIN, LOOP JOIN) using a Plan Guide without changing the query, depending on your knowledge of the data, and the trade-off between best, worst, and average-case performance.
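A minimal sketch of the plan guide approach follows. The guide name and the permanent tables dbo.LeftTable and dbo.RightTable are hypothetical stand-ins for the vendor's objects. The text in @stmt has to match the statement the application submits character for character (including whitespace), so in practice you would copy it from a trace or the plan cache rather than type it by hand.

```sql
-- Hypothetical names: PG_vendor_join, dbo.LeftTable, dbo.RightTable.
-- @stmt must be an exact copy of the vendor statement text.
EXEC sys.sp_create_plan_guide
    @name = N'PG_vendor_join',
    @stmt = N'SELECT *
FROM dbo.LeftTable AS l
LEFT JOIN dbo.RightTable AS r ON
    l.c=r.c AND
    l.a=r.a AND
    l.b=r.b AND
    l.d=r.d AND
    l.e=r.e AND
    l.f=r.f AND
    l.g=r.g AND
    l.h=r.h
WHERE l.a=''2018'';',
    @type = N'SQL',
    @module_or_batch = NULL,   -- NULL: the batch text is the same as @stmt
    @params = NULL,            -- the statement is not parameterized
    @hints = N'OPTION (HASH JOIN, LOOP JOIN)';

-- To remove the guide again:
-- EXEC sys.sp_control_plan_guide @operation = N'DROP', @name = N'PG_vendor_join';
```

Listing both HASH JOIN and LOOP JOIN leaves the optimizer free to pick either of the two, but rules out merge join for the statement.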

Finally, as a curiosity, note that the sorts can be avoided with a simple ORDER BY l.b, at the cost of a potentially less efficient many-to-many merge join on b alone, with a complex residual. I mention this mostly as an illustration of the interaction between optimizer features I mentioned previously, and the way top-level requirements can propagate.

Related Solutions

Sql-server – How to reset statistics after UPDATE STATISTICS … WITH ROWCOUNT

Use DBCC UPDATEUSAGE with the COUNT_ROWS option.

DBCC UPDATEUSAGE 
(   { database_name | database_id | 0 } 
    [ , { table_name | table_id | view_name | view_id } 
    [ , { index_name | index_id } ] ] 
) [ WITH [ NO_INFOMSGS ] [ , ] [ COUNT_ROWS ] ]
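For instance, assuming a hypothetical table dbo.MyTable in the current database, a call might look like the sketch below. The second statement is just one way to check the stored row count afterwards.

```sql
-- Hypothetical table name; 0 means "the current database".
DBCC UPDATEUSAGE (0, 'dbo.MyTable') WITH COUNT_ROWS;

-- Check the row count now recorded for the heap or clustered index.
SELECT SUM(ps.row_count) AS row_count
FROM sys.dm_db_partition_stats AS ps
WHERE ps.object_id = OBJECT_ID(N'dbo.MyTable')
  AND ps.index_id IN (0, 1);
```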

See the Microsoft documentation for DBCC UPDATEUSAGE.

Sql-server – Is the WHERE-JOIN-ORDER-(SELECT) rule for index column order wrong

Is the WHERE-JOIN-ORDER-(SELECT) rule for index column order wrong?

At the least it is incomplete and potentially misleading advice (I didn't bother to read the whole article). If you're going to read stuff on the Internet (including this), you should adjust your amount of trust according to how well you already know and trust the author, but always then verify for yourself.

There are a number of "rules of thumb" for creating indexes, depending on the exact scenario, but none are really a good substitute for understanding the core issues for yourself. Read up on the implementation of indexes and execution plan operators in SQL Server, go through some exercises, and come to a good solid understanding of how indexes can be used to make execution plans more efficient. There is no effective shortcut to attaining this knowledge and experience.

In general, I can say that your indexes should most often have columns used for equality tests first, with any inequalities last, and/or provided by a filter on the index. This is not a complete statement, because indexes can also provide order, which may be more useful than seeking directly to one or more keys in some situations. For example, ordering can be used to avoid a sort, to reduce the cost of a physical join option like merge join, to enable a stream aggregate, to find the first few qualifying rows quickly, and so on.
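To make the "equality first, inequality last, filter on the index" guideline concrete, here is a small sketch. The table dbo.Orders and its columns are hypothetical and not taken from the question. It illustrates the pattern only and is not a recommendation for any particular workload.

```sql
-- Hypothetical table and columns, for illustration only.
-- Equality column (CustomerID) first, inequality column (OrderDate) last,
-- and a fixed predicate handled by the index filter.
CREATE INDEX IX_Orders_CustomerID_OrderDate
ON dbo.Orders (CustomerID, OrderDate)
WHERE IsCancelled = 0;

-- A query shaped to match: seek on CustomerID, then a range on OrderDate.
-- The key order also returns rows sorted by OrderDate within each customer.
SELECT o.CustomerID, o.OrderDate
FROM dbo.Orders AS o
WHERE o.CustomerID = 42
  AND o.OrderDate >= '20180101'
  AND o.IsCancelled = 0
ORDER BY o.OrderDate;
```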

I'm being a little vague here, because selecting the ideal index(es) for a query depends on so many factors - this is a very broad topic.

Anyway, it is not unusual to find conflicting signals for the 'best' indexes in a query. For example, your join predicate would like rows ordered one way for a merge join, the group by would like rows sorted another way for a stream aggregate, and finding the qualifying rows using the where clause predicates would suggest other indexes.

The reason indexing is an art as well as science is that an ideal combination is not always logically possible. Choosing the best compromise indexes for the workload (not just a single query) requires analytic skills, experience, and system-specific knowledge. If it were easy, the automated tools would be perfect, and performance-tuning consultants would be much less in demand.

As far as missing index suggestions are concerned: these are opportunistic. The optimizer brings them to your attention when it tries to match predicates and required sort order to an index that does not exist. The suggestions are therefore based on particular matching attempts in the specific context of the particular sub-plan variation it was considering at the time.

In context, the suggestions always make sense, in terms of reducing the estimated cost of data access, according to the optimizer's model. It does not do a wider analysis of the query as a whole (much less the wider workload), so you should think of these suggestions as a gentle hint that a skilled person needs to look at the available indexes, with the suggestions as a starting point (and usually no more than that).

In your case, the (Status) INCLUDE (ID) suggestion probably came about when it was looking at the possibility of a hash or merge join (example later). In that narrow context, the suggestion makes sense. For the query as a whole, maybe not. The index (ID, Status) enables a nested loop join with ID as an outer reference: equality seek on ID and inequality on Status per iteration.

One possible selection of indexes is:

CREATE INDEX i1 ON dbo.I (ID, [Status]);
CREATE INDEX i1 ON dbo.IP (Deleted, OPID, IID) INCLUDE (Q);

...which produces a plan like:

I am not saying these indexes are optimal for you; they happen to work to produce a reasonable-looking plan to me, without being able to see statistics for the tables involved, or the full definitions and existing indexing. Also, I know nothing of the wider workload or real query.

Alternatively (just to show one of the myriad additional possibilities):

CREATE INDEX i1 ON dbo.I ([Status]) INCLUDE (ID);
CREATE INDEX i1 ON dbo.IP (Deleted, IID, OPID) INCLUDE (Q);

Gives:

Execution plans were generated using SQL Sentry Plan Explorer.
