Db2 – cross database way to force join strategies

db2join;optimization

[how to enforce the query to use certain join algorithm]

Knowing that there are several joining algorithms [Nested Loop; Sort Merge; Hash Join], when implementing a query , is there any technique to enforce the database to adapt certain Join algorithm ? Perhaps using some hints ? creating some indexes ?

Related Topic: Enforce Hash Join — not that useful for me.

Adding more details:
I am talking about DB2 V9.7, but I originally thought this would apply for other DB as well as they should share some common characteristic

Best Answer

First of all, note that not all databases support all three (Loop, Hash, Merge) types of join. For example, you cannot hash join in all version MySQL (thanks for the comment pointing out that this now works in MariaDb).

Second, databases will often have vendor specific syntax to force certain join strategies and orders. For example, SQL Server uses the keyword LOOP JOIN to force a loop, Oracle uses USE_NL, MySQL uses the syntax STRAIGHT_JOIN to force join order.

All that being said, there are often ways to write a query in such a way that it forces certain behaviours across database platforms. Here are the strategies I have found useful:

Forcing Loop Joins: Expressing a JOIN as a nested, correlated subquery instead of a regular join will typically force the loop strategy.

For example, instead of:

SELECT ...
FROM t1 JOIN t2 ON t1.key = t2.key

Write:

SELECT ... (SELECT ... FROM t2 WHERE t2.key = t1.key)
FROM t1 

Forcing Hash Join: Either remove the indexes on the small table or force a table scan of the table. As a generic method for disabling indexes, you can purposefully confuse the optimiser. For example, instead of:

SELECT ...
FROM t1
JOIN t2 ON t1.key = t2.key

Write (Assuming key is INT):

SELECT ...
FROM t1
JOIN t2 ON t1.key / 1 = t2.key / 1

This trick can be expressed more elegantly using functions like COALESCE And NVL (in Oracle)

Forcing Merge: I have not found a generic way to do this. But having the right indexes in place and then forcing the indexes does the trick in the cases I have seen.

Forcing Join Order: Join order is a little easier to force. The trick is to create artificial dependencies between the join conditions. For example, let us say you wanted to force this to be joined in the order t1, t2, t3:

SELECT ...
FROM t1 
JOIN t2 ON t2.k = t1.k
JOIN t3 ON t3.k = t2.key

You would write:

SELECT ...
FROM t1
JOIN t2 ON t2.k = t1.k / 1       /* For t2 to be looked up first */
JOIN t3 ON t3.k = t2+0*t1.k     /* Force t1 to be evaluated first */

For a general treatment of query optimisation across database platform, I would recommend picking up a copy of this: http://www.amazon.co.uk/SQL-Tuning-Dan-Tow-ebook/dp/B0026OR32Q/