TL;DR
Since this question keeps getting views, I'll summarize it here so newcomers don't have to suffer the history:
JOIN table t ON t.member = @value1 OR t.member = @value2 -- this is slow as hell
JOIN table t ON t.member = COALESCE(@value1, @value2) -- this is blazing fast
-- Note that here if @value1 has a value, @value2 is NULL, and vice versa
I realize this might not be everyone's problem, but by highlighting the sensitivity of the ON clauses, it might help you look in the right direction. In any case the original text is here for future anthropologists:
Original text
Consider the following simple query (only 3 tables involved)
SELECT
l.sku_id AS ProductId,
l.is_primary AS IsPrimary,
v1.category_name AS Category1,
v2.category_name AS Category2,
v3.category_name AS Category3,
v4.category_name AS Category4,
v5.category_name AS Category5
FROM category c4
JOIN category_voc v4 ON v4.category_id = c4.category_id and v4.language_code = 'en'
JOIN category c3 ON c3.category_id = c4.parent_category_id
JOIN category_voc v3 ON v3.category_id = c3.category_id and v3.language_code = 'en'
JOIN category c2 ON c2.category_id = c3.category_id
JOIN category_voc v2 ON v2.category_id = c2.category_id and v2.language_code = 'en'
JOIN category c1 ON c1.category_id = c2.parent_category_id
JOIN category_voc v1 ON v1.category_id = c1.category_id and v1.language_code = 'en'
LEFT OUTER JOIN category c5 ON c5.parent_category_id = c4.category_id
LEFT OUTER JOIN category_voc v5 ON v5.category_id = c5.category_id and v5.language_code = @lang
JOIN category_link l on l.sku_id IN (SELECT value FROM #Ids) AND
(
l.category_id = c4.category_id OR
l.category_id = c5.category_id
)
WHERE c4.[level] = 4 AND c4.version_id = 5
This is a pretty simple query, the only confusing part is the last category join, it's this way because category level 5 might or might not exist. At the end of the query I am looking for category info per product ID (SKU ID), and the that's where the very large table category_link comes in. Finally, the table #Ids is just a temp table containing 10'000 Ids.
When executed, I get the following actual execution plan:
As you can see, almost 90% of the time is spent in the Nested Loops (Inner Join). Here's extra information on those Nested Loops:
Note that the table names don't match exactly because I edited the query table names for readability, but it's pretty easy to match (ads_alt_category = category). Is there any way to optimize this query? Also note that in production, the temp table #Ids doesn't exist, it's a Table Valued Parameter of the same 10'000 Ids passed on to the Stored Procedure.
Additional info:
- category indices on category_id and parent_category_id
- category_voc index on category_id, language_code
- category_link index on sku_id, category_id
Edit (solved)
As pointed out by the accepted answer, the problem was the OR clause in the category_link JOIN. However, the code suggested in the accepted answer is very slow, slower even than the original code. A much faster and also much cleaner solution is simply to replace the current JOIN condition with the following:
JOIN category_link l on l.sku_id IN (SELECT value FROM @p1) AND l.category_id = COALESCE(c5.category_id, c4.category_id)
This minute tweak is the fastest solution, tested against the double join from the accepted answer and also tested against the CROSS APPLY as suggested by valverij.
Best Answer
The problem appears to be in this part of the code:
or
in join conditions is always suspicious. One suggestion is to split this into two joins:You then have to modify the rest of the query to handle this . . .
coalesce(l1.sku_id, l2.sku_id)
for instance in theselect
clause.