I'm trying to understand why a non-clustered index is not available for a given query when filtered. The relevant part of my (large) query is this:
) results
JOIN BE_Insurance ins
ON results.PayorId = ins.Id
Up in the select, I'm only grabbing ins.name. The index I initially created was this:
CREATE INDEX IX_BE_Insurance_PayorId_PayorName
ON BE_Insurance (Id) INCLUDE (Name)
WHERE ParentId IS NULL
My query is such that only payorIds with a NULL parentId will be chosen, but I understood the optimizer's disinclination to choose it. But when I added a hint to try to force the index, the whole thing error'd out.
Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN.
I assumed it would follow my hint, and maybe error out at execution time if I wound up with some bad data and the index was missing some needed values. Removing the filter from the index caused it to be chosen successfully by the query (even without the hint).
Are indexes with a WHERE clause ineligible for queries like this? Are they eligible only if the optimizer can be guaranteed that the filter is valid and won't produce any missing values?
As requested, here's the whole query, which is still a work in progress.
SELECT results.*
FROM (
SELECT auths.*,
worked.WorkedHours,
worked.WorkedUnits,
worked.WorkedAmount,
worked.ActiveClients
FROM (
SELECT PayorId,
SUM(AuthHours) AuthHours,
SUM(AuthUnits) AuthUnits,
SUM(AuthAmount) AuthAmount
FROM (
--DYNAMIC TEMPLATE ----------------------------------------------------------------------------------
SELECT PayorId,
PayorName,
AuthHours AuthHours,
AuthUnits AuthUnits,
AuthAmount AuthAmount
FROM PayorAuthorizations_Level1Data_Authorizations auths WITH(NOEXPAND)
WHERE OrganizationId = @organizationId AND StartDate <= @endDate AND @startDate <= EndDate
--/DYNAMIC TEMPLATE----------------------------------------------------------------------------------
--INTERSECT / EXCEPT dynamically generated queries
) q
GROUP BY PayorId
) auths
JOIN (
SELECT PayorId,
SUM(ISNULL(TotalWorkedHours, 0)) WorkedHours,
SUM(ISNULL(TotalWorkedUnits, 0)) WorkedUnits,
SUM(ISNULL(TotalWorkedAmount, 0)) WorkedAmount,
COUNT(DISTINCT clientId) ActiveClients
FROM (
--DYNAMIC TEMPLATE ----------------------------------------------------------------------------------
SELECT PayorId,
TotalWorkedHours,
TotalWorkedUnits,
TotalWorkedAmount,
clientId clientId
FROM PayorAuthorizations_Level1Data_CurrentlyWorked worked WITH(NOEXPAND)
WHERE OrganizationId = @organizationId AND StartDate <= @endDate AND @startDate <= EndDate
--/DYNAMIC TEMPLATE----------------------------------------------------------------------------------
--INTERSECT / EXCEPT other dynamically generated queries
) q
GROUP BY PayorId
) worked
ON auths.payorId = worked.payorId
) results
JOIN BE_Insurance ins WITH (INDEX(IX_BE_Insurance_PayorId_PayorName))
ON results.PayorId = ins.Id
OPTION(FORCE ORDER, MERGE JOIN)
Best Answer
The query optimizer will only use a filtered index in a query plan if it can guarantee (within its reasoning framework) that all possible matches can be served from the index. This is by design, to avoid the sort of runtime error you describe.
This is a known current limitation. Adding the filtered column(s) to the key or include list is the standard workaround, and a current best practice for all sorts of semi-related reasons.
Be extremely careful using hints (directives) like this unless you fully understand all the consequences.
FORCE ORDER
in particular is an extremely powerful and wide-ranging hint, with a number of non-obvious side-effects including the placement of aggregate operators, and the order of evaluation of subqueries and common table expressions.For the most part, you should try to write queries that provide the query optimizer with enough good-quality information to make the right decisions without hints. The hinted plan may be 'optimal' today, but it may not remain so as the data volume and/or distribution changes over time.