I had a hard time coming up with an appropriate title and it might be hard for me to explain the issue, but I'll try to be as clear as possible.
I have a dynamic query that is pulling assets based on their asset title or one of many "identifiers" that are attached to that asset. Every asset as a "snapshot" ID that is tied to a asset identifier snapshot table, that is then tied to a version table that holds the actual value. That relationship looks like this:
SELECT
*
FROM
vwAsset AS Asset
INNER JOIN dbo.tblAssetIdentifierSnapshot ON tblAssetIdentifierSnapshot.AssetSnapshotId = Asset.AssetSnapshotId
INNER JOIN dbo.tblAssetIdentifierVersion AS AssetIdentifierVersion ON AssetIdentifierVersion.AssetIdentifierVersionId = tblAssetIdentifierSnapshot.AssetIdentifierVersionId
The value in the tblAssetIdentifierVersion
table is what I'm trying to get at. Here is what the full query looks like:
And here's a pastebin.
If I run this query without the two highlighted joins and remove the OR
, it runs instantly. But adding these joins makes the query run for 25+ seconds.
When I run this through Sentry Plan Explorer, it shows that some 274K+ rows are being searched for each join, before it finally filters down to the 137 it returns.
This is the query columns that sentry plan explorer shows:
I'm not an expert, so take this with a grain of salt, but every looks mostly okay. I don't think anything can be done about index scans, right?
Is there any way to filter down the left joins more without knowing the snapshot ID or the versions that are tied to the asset?
Here's the execution plan: https://www.brentozar.com/pastetheplan/?id=HJtsYtGLe
Best Answer
It looks like this query is an example of a problematic use of OR according to SQLCAT:
In the query plan with the additional tables filtered out, SQL Server is able to filter out irrelevant rows quickly. Also the filter condition can be applied directly to the table.
In the query plan with the
OR
clause, SQL Server filters out unnecessary rows only at the very end of the plan, after doing millions of unnecessary index seeks. This is the query that you reported as being slow in your question.In the query plan with the query that I suggested (
UNION
instead ofOR
), SQL Server is more able to directly apply the filters and less overall work is done. There is not a filtering step at the end that eliminates hundreds of thousands of rows.Note that this approach may not work as well as it usually does because you're querying a view and the logic behind that view may be expensive. It is also true that rewriting the
OR
as aUNION
will result in duplicated code. The best way around this is to use a derived table. Unfortunately, this may not be a good option for you depending on the possible filter conditions that end users can add. If you can use a derived table, one rewrite is as follows:If you can't use a derived table and need to duplicate the full query logic that's okay. To get the TOP behavior that you're looking for you mean need to define the whole query as a CTE and do the TOP filtering as a final step.
As a final note, carefully consider if
DISTINCT TOP 250
without anORDER BY
clause is what you want. With that query you will just get an arbitrary 250 rows. If you are returning data to an end user there may be a more meaningful way to order or limit the results.