I have two similar queries that generate the same query plan, except that one query plan executes a Clustered Index Scan 1316 times, while the other executes it 1 time.
The only difference between the two queries is different date criteria. The long running query actually narrower date criteria, and pulls back less data.
I have identified some indexes that will help with both queries, but I just want to understand why the Clustered Index Scan operator is executing 1316 times on a query that is virtually the same as the one that where it executes 1 time.
I checked the statistics on the PK that is being scanned, and they are relatively up to date.
Original query:
select distinct FIR_Incident.IncidentID
from FIR_Incident
left join (
select incident_id as exported_incident_id
from postnfirssummary
) exported_incidents on exported_incidents.exported_incident_id = fir_incident.incidentid
where FI_IncidentDate between '2011-06-01 00:00:00.000' and '2011-07-01 00:00:00.000'
and exported_incidents.exported_incident_id is not null
Generates this plan:
After narrowing the date range criteria:
select distinct FIR_Incident.IncidentID
from FIR_Incident
left join (
select incident_id as exported_incident_id
from postnfirssummary
) exported_incidents on exported_incidents.exported_incident_id = fir_incident.incidentid
where FI_IncidentDate between '2011-07-01 00:00:00.000' and '2011-07-02 00:00:00.000'
and exported_incidents.exported_incident_id is not null
Generates this plan:
Best Answer
The JOIN after the scan gives a clue: with less rows on one side of the last join (reading right to left of course) the optimiser chooses a "nested loop" not a "hash join".
However, before looking at this I'd aim to eliminate the Key Lookup and the DISTINCT.
Key lookup: your index on FIR_Incident should be covering, probably
(FI_IncidentDate, incidentid)
or the other way around. Or have both and see which is used more often (they both may be)The
DISTINCT
is a consequence of theLEFT JOIN ... IS NOT NULL
. The optimiser has already removed it (the plans have "left semi joins" on the final JOIN) but I'd use EXISTS for claritySomething like:
You can also use plan guides and JOIN hints to make SQL Server use a hash join, but try to make it work normally first: a guide or a hint probably won't stand the test of time because they are only useful for the data and queries you run now, not in the future