I have a query that joins to a table containing 2 million rows and need to filter on its geometry column by its geometry type (using geom.STGeometryType()). The query takes less than a second to run without this WHERE clause but over 40 seconds to run with the WHERE clause.
The execution plan seems to indicate that the filter on STGeometryType() is being run over the geometry column of the whole locations table and there's an accompanying warning that there are no statistics for the geometry column.
If I first select into a temporary table without the filter and then filter on the temporary table, I get the results in under a second. This is because it is only testing for geometry type over a result set of 25000 records. So one solution would be to do this (though it would require re-factoring the application).
I've also tried using a CTE and derived table to do all of the joins 'first' and then filter but that doesn't improve performance.
Is there a way I can structure the query so that this relatively expensive filter is applied last without the use of intermediate tables?
select o.code as Code, ro.ReportTaxonCode as TaxonCode, t1.TaxonName as TaxonName,
case
when o.NumberObserved IS NOT NULL THEN o.NumberObserved
when o.NumberObserved IS NULL then su.Mean
end as Scale,
ro.OriginalTaxonCode as RecordedCode, t2.TaxonName as RecordedName, s.code as SetCode, CAST(s.StartDate AS nvarchar), CAST(s.EndDate AS nvarchar), l.code as Location, l.geom
from obsveg.obs_observations o
inner join obsveg.obs_sets s
on o.obs_setsguid = s.guid
inner join obsveg.obs_locations l
on o.obs_locationsguid = l.guid
inner join obsveg.obs_rpt_reportobservations ro
on o.guid = ro.obs_observationsguid
inner join obsveg.obs_taxon t1
on ro.ReportTaxonCode = t1.code
inner join obsveg.obs_taxon t2
on ro.OriginalTaxonCode = t2.code
left join obsveg.OBS_MeasurementMethodScaleUnits mmsu
on o.OBS_MeasurementMethodScaleUnitsGUID = mmsu.GUID
left join obsveg.obs_ScaleUnits su
on mmsu.OBS_ScaleUnitsGUID = su.GUID
where ro.reportguid = '64c0a4af-ee4d-4e83-a194-2a14e8a6ab0e'
and l.geom.STGeometryType() = 'Point'
EDIT Warning in execution plan added
EDIT Execution plan posted to pastebin
Execution plan
Best Answer
The first thing to look at is your indexing strategy. Bad execution plans are often caused by insufficient indexes or stale statistics. Your statistics warning might hint at that.
If that does not resolve your problem there are a few hacks that you can try:
A top operation requires SQL Server to separate query sections in the execution plan.
If you know you are always dealing with less then 2 billion rows you could write your query like this:
An alternative is to write a scalar valued function that takes in the l.geom column and a few other columns from the other tables and returns the STGeometryType() value while ignoring all the other values. Because SQL Server does not consider the function logic at optimization time, it is forced to execute the function after the join. That does not guarantee that the other filter is executed first but often it works out that way.
The third option is to play around with join hints and join order. They often lead to a change in where filters are applied.
All three options I would consider a last resort because they make the code ugly and you run the risk that someone removes that ugliness later trying to make the code better.