Sql-server – Why does changing the declared join column order introduce a sort

join;sort-operatorsql serversql server 2014sql-server-2017

I have two tables with identically named, typed, and indexed key columns. One of the them has a unique clustered index, the other one has a non-unique.

The test setup

Setup script, including some realistic statistics:

DROP TABLE IF EXISTS #left;
DROP TABLE IF EXISTS #right;

CREATE TABLE #left (
    a       char(4) NOT NULL,
    b       char(2) NOT NULL,
    c       varchar(13) NOT NULL,
    d       bit NOT NULL,
    e       char(4) NOT NULL,
    f       char(25) NULL,
    g       char(25) NOT NULL,
    h       char(25) NULL
    --- and a few other columns
);

CREATE UNIQUE CLUSTERED INDEX IX ON #left (a, b, c, d, e, f, g, h)

UPDATE STATISTICS #left WITH ROWCOUNT=63800000, PAGECOUNT=186000;

CREATE TABLE #right (
    a       char(4) NOT NULL,
    b       char(2) NOT NULL,
    c       varchar(13) NOT NULL,
    d       bit NOT NULL,
    e       char(4) NOT NULL,
    f       char(25) NULL,
    g       char(25) NOT NULL,
    h       char(25) NULL
    --- and a few other columns
);

CREATE CLUSTERED INDEX IX ON #right (a, b, c, d, e, f, g, h)

UPDATE STATISTICS #right WITH ROWCOUNT=55700000, PAGECOUNT=128000;

The repro

When I join these two tables on their clustering keys, I expect a one-to-many MERGE join, like so:

SELECT *
FROM #left AS l
LEFT JOIN #right AS r ON
    l.a=r.a AND
    l.b=r.b AND
    l.c=r.c AND
    l.d=r.d AND
    l.e=r.e AND
    l.f=r.f AND
    l.g=r.g AND
    l.h=r.h
WHERE l.a='2018';

This is the query plan I want:

This is what I want.

(Never mind the warnings, they have to do with the fake statistics.)

However, if I change the order of the columns around in the join, like so:

SELECT *
FROM #left AS l
LEFT JOIN #right AS r ON
    l.c=r.c AND     -- used to be third
    l.a=r.a AND     -- used to be first
    l.b=r.b AND     -- used to be second
    l.d=r.d AND
    l.e=r.e AND
    l.f=r.f AND
    l.g=r.g AND
    l.h=r.h
WHERE l.a='2018';

… this happens:

The query plan after changing the declared column order in the join.

The Sort operator seems to order the streams according to the declared order of the join, i.e. c, a, b, d, e, f, g, h, which adds a blocking operation to my query plan.

Things I've looked at

  • I've tried changing the columns to NOT NULL, same results.
  • The original table was created with ANSI_PADDING OFF, but creating it with ANSI_PADDING ON does not affect this plan.
  • I tried an INNER JOIN instead of LEFT JOIN, no change.
  • I discovered it on a 2014 SP2 Enterprise, created a repro on a 2017 Developer (current CU).
  • Removing the WHERE clause on the leading index column does generate the good plan, but it kind of affects the results.. 🙂

Finally, we get to the question

  • Is this intentional?
  • Can I eliminate the sort without changing the query (which is vendor code, so I'd really rather not…). I can change the table and indexes.

Best Answer

Is this intentional?

It is by design, yes. The best public source for this assertion was unfortunately lost when Microsoft retired the Connect feedback site, obliterating many useful comments from developers on the SQL Server team.

Anyway, the current optimizer design does not actively seek to avoid unnecessary sorts per se. This is most often encountered with windowing functions and the like, but can also be seen with other operators that are sensitive to ordering, and in particular to preserved ordering between operators.

Nevertheless, the optimizer is quite good (in many cases) at avoiding unnecessary sorting, but this outcome normally occurs for reasons other than aggressively trying different ordering combinations. In that sense, it is not so much a question of 'search space' as it is of the complex interactions between orthogonal optimizer features that have been shown to increase general plan quality at acceptable cost.

For example, sorting can often be avoided simply by matching an ordering requirement (e.g. top-level ORDER BY) to an existing index. Trivially in your case that could mean adding ORDER BY l.a, l.b, l.c, l.d, l.e, l.f, l.g, l.h; but this is an over-simplification (and unacceptable because you do not want to change the query).

More generally, each memo group may be associated with required or desired properties, which may include input ordering. When there is no obvious reason to enforce a particular order (e.g. to satisfy an ORDER BY, or to ensure correct results from an order-sensitive physical operator), there is an element of 'luck' involved. I wrote more about the specifics of that as it pertains to merge join (in union or join mode) in Avoiding Sorts with Merge Join Concatenation. Much of that goes beyond the supported surface area of the product, so treat it as informational, and subject to change.

In your particular case, yes, you may adjust the indexing as jadarnel27 suggests to avoid the sorts; though there is little reason to actually prefer a merge join here. You could also hint a choice between hash or loop physical join with OPTION(HASH JOIN, LOOP JOIN) using a Plan Guide without changing the query, depending on your knowledge of the data, and the trade-off between best, worst, and average-case performance.

Finally, as a curiosity, note that the sorts can be avoided with a simple ORDER BY l.b, at the cost of a potentially less efficient many-to-many merge join on b alone, with a complex residual. I mention this mostly as an illustration of the interaction between optimizer features I mentioned previously, and the way top-level requirements can propagate.