I have a query like this:
SELECT ordIdent.Identifier, ord.OrderId
FROM OrderIdentifier ordIdent
JOIN [order] ord
ON ordIdent.OrderId = ord.OrderId
WHERE ordIdent.Identifier = '29584'
OR ord.ClientOrderId = '29584'
The idea is that we are searching using the value '29584' and we don't really know if it is a ClientOrderId
or an Identifier
, so we want to search both.
This gives really really bad performance. It pulls millions of rows (125 thousand from one and 20 Million from the other)
But this query:
SELECT ordIdent.Identifier, ord.OrderId
FROM OrderIdentifier ordIdent
JOIN [order] ord
ON ordIdent.OrderId = ord.OrderId
WHERE ord.ClientOrderId = '01193504'
UNION
SELECT ordIdent.Identifier, ord.OrderId
FROM OrderIdentifier ordIdent
JOIN [order] ord
ON ordIdent.OrderId = ord.OrderId
WHERE ordIdent.Identifier = '01193504'
has good performance. Normally I would just do this and call it a day.
But my actual query is very large, and I would rather not call it twice (though it is still better than using the "or".) So before I just set it up to be called twice, I thought I would ask around and see if I have other options.
It seems I should be to do some kind of left join to make this work. But I can't seem to wrap my head around it.
Is there a way to do this in one query that has a left join instead of an "OR"? (Or some other kind of option?)
UPDATE:
OK, I got thinking about it and I realized I could just factor out the OR into a CTE that uses the union
and then use the CTE the actual query.
I will leave this open in case there is a better way to deal with it, but a CTE fixes my issue.
Best Answer
I think that you can rewrite that query to use a
FULL OUTER JOIN
:But why would you want to? You'll end up reading all rows from both tables with that syntax which causes the performance problem that you're trying to avoid.
UNION
orUNION ALL
is the right way to go here. For filters like you have you can still useUNION ALL
like so:Sometimes there's a slight performance benefit in doing so. It depends on your data.