SQL Server – Difference Between Hash, Merge, and Loop Join

sql server

In SQL Server you can specify the join hints:

HASH JOIN
MERGE JOIN
LOOP JOIN

What is the definition of those three join hints, and when should each be used?

Best Answer

From MSDN, in the topic of Advanced Query Tuning Concepts:

SQL Server employs three types of join operations:

Nested loops joins

Merge joins

Hash joins

If one join input is small (fewer than 10 rows) and the other join input is fairly large and indexed on its join columns, an index nested loops join is the fastest join operation because they require the least I/O and the fewest comparisons. For more information about nested loops, see Understanding Nested Loops Joins.

If the two join inputs are not small but are sorted on their join column (for example, if they were obtained by scanning sorted indexes), a merge join is the fastest join operation. If both join inputs are large and the two inputs are of similar sizes, a merge join with prior sorting and a hash join offer similar performance. However, hash join operations are often much faster if the two input sizes differ significantly from each other. For more information, see Understanding Merge Joins.

Hash joins can efficiently process large, unsorted, non indexed inputs.

But I believe that you should start with a more basic topic: Query Tuning and lastly go to using the query hints.

Related Solutions

Sql-server – SQL Server does not optimize parallel merge join on two equivalently partitioned tables

You're right that the SQL Server optimizer prefers not to generate parallel MERGE join plans (it costs this alternative quite high). Parallel MERGE always requires repartitioning exchanges on both join inputs, and more importantly, it requires that row order be preserved across those exchanges.

Parallelism is most efficient when each thread can run independently; order preservation often leads to frequent synchronization waits, and may ultimately cause exchanges to spill to tempdb to resolve an intra-query deadlock condition.

These problems can be circumvented by running multiple instances of the whole query on one thread each, with each thread processing an exclusive range of data. This is not a strategy that the optimizer considers natively, however. As it is, the original SQL Server model for parallelism breaks the query at exchanges, and runs the plan segments formed by those splits on multiple threads.

There are ways to achieve running whole query plans on multiple threads over exclusive data set ranges, but they require trickery that not everyone will be happy with (and will not be supported by Microsoft or guaranteed to work in the future). One such approach is to iterate over the partitions of a partitioned table and give each thread the task of producing a subtotal. The result is the SUM of the row counts returned by each independent thread:

Obtaining partition numbers is easy enough from metadata:

DECLARE @P AS TABLE
(
    partition_number integer PRIMARY KEY
);

INSERT @P (partition_number)
SELECT
    p.partition_number
FROM sys.partitions AS p 
WHERE 
    p.[object_id] = OBJECT_ID(N'test_transaction_properties', N'U')
    AND p.index_id = 1;

We then use these numbers to drive a correlated join (APPLY), and the $PARTITION function to limit each thread to the current partition number:

SELECT
    row_count = SUM(Subtotals.cnt)
FROM @P AS p
CROSS APPLY
(
    SELECT
        cnt = COUNT_BIG(*)
    FROM dbo.test_transaction_item_detail AS i
    JOIN dbo.test_transaction_properties AS t ON
        t.transactionID = i.transactionID
    WHERE 
        $PARTITION.pf_test_transactionId(t.transactionID) = p.partition_number
        AND $PARTITION.pf_test_transactionId(i.transactionID) = p.partition_number
) AS SubTotals;

The query plan shows a MERGE join being performed for each row in table @P. The clustered index scan properties confirm that only a single partition is processed on each iteration:

Apply serial plan

Unfortunately, this only results in sequential serial processing of partitions. On the data set you provided, my 4-core (hyperthreaded to 8) laptop returns the correct result in 7 seconds with all data in memory.

To get the MERGE sub-plans to run concurrently, we need a parallel plan where partition ids are distributed over the available threads (MAXDOP) and each MERGE sub-plan runs on a single thread using the data in one partition. Unfortunately, the optimizer frequently decides against parallel MERGE on cost grounds, and there is no documented way to force a parallel plan. There is an undocumented (and unsupported) way, using trace flag 8649:

SELECT
    row_count = SUM(Subtotals.cnt)
FROM @P AS p
CROSS APPLY
(
    SELECT
        cnt = COUNT_BIG(*)
    FROM dbo.test_transaction_item_detail AS i
    JOIN dbo.test_transaction_properties AS t ON
        t.transactionID = i.transactionID
    WHERE 
        $PARTITION.pf_test_transactionId(t.transactionID) = p.partition_number
        AND $PARTITION.pf_test_transactionId(i.transactionID) = p.partition_number
) AS SubTotals
OPTION (QUERYTRACEON 8649);

Now the query plan shows partition numbers from @P being distributed among threads on a round-robin basis. Each thread runs the inner side of the nested loops join for a single partition, achieving our goal of processing disjoint data concurrently. The same result is now returned in 3 seconds on my 8 hyper-cores, with all eight at 100% utilization.

Parallel APPLY

I am not recommending you use this technique necessarily - see my earlier warnings - but it does address your question.

See my article Improving Partitioned Table Join Performance for more details.

Columnstore

Seeing as you are using SQL Server 2012 (and assuming it is Enterprise) you also have the option of using a columnstore index. This shows the potential of batch mode hash joins where sufficient memory is available:

CREATE NONCLUSTERED COLUMNSTORE INDEX cs 
ON dbo.test_transaction_properties (transactionID);

CREATE NONCLUSTERED COLUMNSTORE INDEX cs 
ON dbo.test_transaction_item_detail (transactionID);

With these indexes in place the query...

SELECT
    COUNT_BIG(*)
FROM dbo.test_transaction_properties AS ttp
JOIN dbo.test_transaction_item_detail AS ttid ON
    ttid.transactionID = ttp.transactionID;

...results in the following execution plan from the optimizer without any trickery:

Columnstore plan 1

Correct results in 2 seconds, but eliminating the row-mode processing for the scalar aggregate helps even more:

SELECT
    COUNT_BIG(*)
FROM dbo.test_transaction_properties AS ttp
JOIN dbo.test_transaction_item_detail AS ttid ON
    ttid.transactionID = ttp.transactionID
GROUP BY
    ttp.transactionID % 1;

Optimized columnstore

The optimized column-store query runs in 851ms.

Geoff Patterson created the bug report Partition Wise Joins but it was closed as Won't Fix.

SQL Server – Forcing a Merge Join Without Using Hints

To switch from a nested loop join to a merge join without a hint you need the plan with the merge join to have a lower estimated cost than the plan with the nested loop. Technically, you also need the query optimizer to find the lower cost plan during plan exploration but there isn't a lot that you can do about that.

First, a note about merge join costing. In my experience, SQL Server is pretty pessimistic about merge join costing. This seems understandable when you consider how much the IO requirements for a merge join can change when small amounts of data are changed. Consider a simple example in which one table has the integers from 1 - 10000 and the other has the integers from 100001 to 1100000:

DROP TABLE IF EXISTS X_SMALL_TABLE;

CREATE TABLE X_SMALL_TABLE (ID INT NOT NULL PRIMARY KEY (ID));

INSERT INTO X_SMALL_TABLE WITH (TABLOCK) (ID)
SELECT N
FROM dbo.GetNums(10000);

UPDATE STATISTICS X_SMALL_TABLE WITH FULLSCAN;

DROP TABLE IF EXISTS X_LARGE_TABLE;

CREATE TABLE X_LARGE_TABLE (ID INT NOT NULL PRIMARY KEY (ID));

INSERT INTO X_LARGE_TABLE WITH (TABLOCK) (ID)
SELECT N + 100000
FROM dbo.GetNums(1000000);

UPDATE STATISTICS X_LARGE_TABLE WITH FULLSCAN;

If I join the tables together I will get 0 rows returned. If I force a MERGE JOIN then SQL Server will scan all of the rows from the small table and only one of the rows from the large table. By default, SQL Server will traverse through the tables in ascending order according to the join key. However, the query optimizer estimates that all 1000000 rows will be scanned from the large table and subsequently assigns a relatively large cost to that operator:

As we can see from the actual plan, only one row was scanned. What happens if we insert just one new row into the small table?

INSERT INTO X_SMALL_TABLE
SELECT 1200000;

UPDATE STATISTICS X_SMALL_TABLE WITH FULLSCAN;

Since the merge join algorithm goes in ID order once we hit ID 10000 in the small table we should loop through all of the rows in the large table until we hit ID 1200000.

The actual plan confirms that all rows were read from both tables. So small changes in data can lead to large differences in merge join performance. Based on that it seems reasonable for SQL Server to be pessimistic about merge join costing, which will work against you if you are trying to get a merge join to appear in your query. In your second query you scan all of the rows from CustomerDiscountCards. That operation alone could have a higher estimated cost than the entire nested loop join plan.

When using hints to force a MERGE JOIN note that the MERGE JOIN join hint also adds a FORCE ORDER hint to the query. Based on the information in the question, your second query doesn't run quickly strictly because of the MERGE JOIN. It runs quickly because of the new join order. The optimizer joins together CustomerDiscountCards and DiscountCards using full scans on both tables and finds 0 matching rows. Any inner join against a 0 row result set is going to be fast because query execution can stop at that point (disregarding blocking operators which occur before that). So instead of focusing on forcing a merge join, which will probably be difficult without hints, I would focus on getting the join order correct.

A simple way to force join order is to materialize the first result set that you want into a temporary table. You could join CustomerDiscountCards and DiscountCards in an initial query and put the matching relevant rows into a temp table. This should run in under a second for your test case because you won't actually be inserting any rows into the temp table. If your second query joins to the empty table it should finish almost instantly. You will need to test the performance of the query for other parameter values in which you actually get rows back because you could be putting a fair number of rows and columns into tempdb.

You may be able to rewrite your SQL to effectively force the join order without using a temp table. Of course using the FORCE ORDER hint is the easiest way to do that but you said that you don't want to use hints and that hint can be a difficult one to use. It applies to the entire query so if your data ever changes you could end up with poor performance. For an idea on how to force join order without the hint, it looks like pairing together the tables that you want to be joined first in a derived table and adding a superfluous TOP operator seems to push SQL Server in the right direction:

INNER JOIN 
(
    SELECT TOP 9223372036854775807 ...
    FROM 
    CustomerDiscountCards
    INNER JOIN DiscountCards ON ...
) t ON ...

However, I cannot conclude if SQL Server is actually forced to evaluate that join separately from the rest of the query. As before you will need to test.

To give you a final option to consider, since the initial problem appears to relate to row goals you could hide the row goal from the optimizer. One way to accomplish that is to use the OPTIMIZE FOR query hint and to use a variable for your TOP statement. Consider the query snippet below:

DECLARE @TOP BIGINT = 10;

SELECT TOP (@top)
...
OPTION (OPTIMIZE FOR (@TOP = 9223372036854775807))

That query will only return the first 10 rows that it finds in the result set. However, the plan will be created as if it returns the first 9223372036854775807 rows. This may disfavor the nested loop join plan that you experienced the issues with, but it's hard to say how well this will perform in general.

Ultimately, it's hard to say more with the information that we have in the question. Presumably your query can return rows in some cases (there isn't much reason to continually run a query that always returnns no rows) so you'll need to test your options for all of the different cases.

Best Answer

Related Solutions

Sql-server – SQL Server does not optimize parallel merge join on two equivalently partitioned tables

Columnstore

SQL Server – Forcing a Merge Join Without Using Hints

Related Question