SQL Server 2016 – How to Force SQL Server to Use Spatial Index Through a View

spatialsql serversql-server-2016

I have tables which holds transactions for properties that are stored as lat long pairs. (There are more columns and datapoints then in my example schema).

A common request is to find the transactions that happened within X miles of a particular point, and only retrieve the 5 most recent transactions that happened for each nearby property.

To make this work, I decided to add a view that encapsulates this logic of most recent:

create or alter view dbo.v_example
with schemabinding as
select example_id
      ,transaction_dt
      ,latitude
      ,longitude
      ,latlong
      ,most_recent= iif(row_number() over (partition by latitude,longitude order by transaction_dt desc) < 5,1,null)
from dbo.example;

So a query might look like this::

select *
from dbo.v_example
where latlong.STDistance(geography::Point(40,-74,4326)) <=1609.344e1
and most_recent = 1

Unfortunately, SQL Server does not want to use the spatial index when I query through the view. If I remove schemabinding and try to add a hint on the view, I get that the query processor cannot create a plan.

How can I encapsulate the logic and still get it to use my spatial index?

Here's a db<>fiddle with example data and plan shapes.

The table is much larger, and it is much slower to scan it then to do a clustered index seek and then find the near by point.

Best Answer

Window functions & Views

Recently I answered a different question regarding views & window functions but not all answers given there apply here.

Two differences are that this example uses geography datatypes and filters on the window function. The positive part here is that you are not bound to the window function and could use something like CROSS APPLY to get the closest results for one latitude & longitude combination.

The previous question + Answer

In short, the window function is calculated before the filtering is applied.

E.G.

where latlong.STDistance(geography::Point(40,-74,4326)) <=1609.344e1
and most_recent = 1

Extra information can be found in this blogpost from 2013 by Paul White on window functions and views.

Testing

As a first test, the result is clear when omitting the ROW_NUMBER() from the view

create or alter view dbo.v_example2
with schemabinding as
select example_id
      ,transaction_dt
      ,latitude
      ,longitude
      ,latlong
from dbo.example;

set statistics xml on;
select *
from dbo.v_example2
where latlong.STDistance(geography::Point(40,-74,4326)) <=1609.344e1

Resulting in the expected performant query plan.

DB<>Fiddle


But you obviously still want the additional filtering

Keeping the view + window function

You could choose to add the window function afterwards like so:

create or alter view dbo.v_example2
with schemabinding as
select example_id
      ,transaction_dt
      ,latitude
      ,longitude
      ,latlong
from dbo.example;

&

set statistics xml on;
;WITH CTE AS
(
select most_recent= iif(row_number() over (partition by latitude,longitude order by transaction_dt desc) < 5,1,null)
,*
from dbo.v_example2
where latlong.STDistance(geography::Point(40,-74,4326)) <=1609.344e1
)
SELECT
* 
FROM CTE
WHERE most_recent is not null;

Again resulting in the expected execution plan.

DB<>Fiddle


Using an inline table valued function + window function

CREATE FUNCTION dbo.F_Example
(
    @P1 INT

)  
RETURNS TABLE
WITH SCHEMABINDING AS
RETURN
    (
SELECT example_id
      ,transaction_dt
      ,latitude
      ,longitude
      ,latlong
      ,most_recent FROM
(
select example_id
      ,transaction_dt
      ,latitude
      ,longitude
      ,latlong
      ,most_recent= iif(row_number() over (partition by latitude,longitude order by transaction_dt desc) < 5,1,null)
from dbo.example
WHERE latlong.STDistance(geography::Point(40,-74,4326)) <= 1609.344e1
) AS A
WHERE most_recent= @P1
);

EXEC SP_EXECUTESQL
N'SELECT * FROM  dbo.F_Example(@P1)',N'@P1 INT',@P1 = 1

Resulting in the query plan you expected.

DB<>Fiddle

Using a CTE + window function

;WITH CTE AS
(
select example_id
      ,transaction_dt
      ,latitude
      ,longitude
      ,latlong
      ,most_recent= row_number() over (partition by latitude,longitude order by transaction_dt desc)
FROM dbo.example
WHERE latlong.STDistance(geography::Point(40,-74,4326)) <=1609.344e1
)

SELECT example_id
      ,transaction_dt
      ,latitude
      ,longitude
      ,latlong
      ,most_recent
FROM CTE2
WHERE most_recent < 5;

Using CROSS APPLY and TOP(4) without the window function

SELECT example_id
      ,a.transaction_dt
      ,latitude
      ,longitude
      ,latlong
from dbo.example e1
CROSS APPLY(
SELECT TOP(4) transaction_dt
FROM dbo.example e2
WHERE e1.latitude = e2.latitude  and e1.longitude = e2.longitude
GROUP BY latitude,longitude,transaction_dt
ORDER BY transaction_dt desc
) as a
WHERE latlong.STDistance(geography::Point(40,-74,4326)) <=1609.344e1
AND e1.transaction_dt = a.transaction_dt
ORDER BY transaction_dt desc;

I used top(4) since the resultset was based on < 5 not <= 5

When using the CROSS APPLY method there is one more index to be added, to remove the index spool & sort:

enter image description here

CREATE INDEX IX_latitude_longitude_transaction_dt
ON dbo.example(latitude,longitude,transaction_dt);

enter image description here

It goes without saying that you could add the CROSS APPLY solution to a view and query it the same way as you used to, an example of this here. All above examples in one DB<>Fiddle here.


End solution comment by MichaelB

Thanks Randi, I was able to rewrite my logic into a select subquery using the idea of your cross apply. As an added bonus, if I don't mention the field in the columns I select, I do not get a performance hit.

DB<>Fiddle

create or alter view dbo.v_example2
with schemabinding as
select example_id
      ,transaction_dt
      ,latitude
      ,longitude
      ,latlong
      ,most_recent= (
          select a.transaction_dt
          intersect
          select top 5 b.transaction_dt
          from dbo.example b
          where a.latitude=b.latitude
                and a.longitude=b.longitude
          order by transaction_dt desc
      )
from dbo.example a;

&

select example_id
      ,transaction_dt
      ,latitude
      ,longitude
      ,latlong
from dbo.v_example2
where latlong.STDistance(geography::Point(40,-74,4326)) <=1609.344e1
      and most_recent is not null