Sql-server – Improving Speed of Cross Apply

spatialsql servert-sql

I posted a question on how to create a relation between 2 tables based on a spatial field
https://stackoverflow.com/questions/63769885/t-sql-passing-column-to-subquery-for-filtering

The performance is terrible when the city has only 5k records and listing has 12k
(takes 54 seconds to run on sql server 2019, dual core i5)

Anything I can do to improve it? Execution plan is https://www.brentozar.com/pastetheplan/?id=BkiRk-74P

2nd Plan (based on the query in the answer by Kumar)
https://www.brentozar.com/pastetheplan/?id=HJSLs_L4D

This is the query

select l.*, city.*
from  listings l
cross apply (
     select  top (1)  c.UnicodeName, c.name, r.code as region, cn.code as country from cities c
     inner join regions r on r.regionid = c.regionid
     inner join Countries cn on cn.CountryId = r.countryid
     where c.location.STDistance(l.location) is not null
     order by c.Location.STDistance(l.location) 
 ) as city

Indexes on cities and listings

CREATE SPATIAL INDEX [256_HHHH] ON [dbo].[Listings]
(
    [Location]
)USING  GEOGRAPHY_GRID 
WITH (GRIDS =(LEVEL_1 = HIGH,LEVEL_2 = HIGH,LEVEL_3 = HIGH,LEVEL_4 = HIGH), 
CELLS_PER_OBJECT = 256, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

CREATE SPATIAL INDEX [16_HHHH] ON [dbo].[Listings]
(
    [Location]
)USING  GEOGRAPHY_GRID 
WITH (GRIDS =(LEVEL_1 = HIGH,LEVEL_2 = HIGH,LEVEL_3 = HIGH,LEVEL_4 = HIGH), 
CELLS_PER_OBJECT = 16, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

Updated Query based on Kumar's Suggestion, which does not bring back any results. it seems this code is assuming that listing and city have the exact same position. They dont, I just want to pick the nearest city for each listing.
I also had to change = to STEquals as there were syntax errors otherwise.

;With CTE as
(
select l.*, c.RegionId, c.UnicodeName, c.name,c.location.STDistance(l.location) Locationdistance
from  listings l
inner join  cities c on l.location.STEquals(c.Location)=1

)

select c.*,r.code as region, cn.code as country
from CTE C
inner join regions r on r.regionid = c.RegionId
     inner join Countries cn on cn.CountryId = r.countryid
     where Locationdistance is not null
order by Locationdistance

Best Answer

Always verify if the join is returing correct number of rows from each join.

Try this,

;With CTE as
(
 select l.*, city.*
from  listings l
cross apply (
     select  top (1)  c.UnicodeName, c.name,c.regionid
     --, r.code as region, cn.code as country 
     from cities c
     --inner join regions r on r.regionid = c.regionid
     --inner join Countries cn on cn.CountryId = r.countryid
     where c.location.STDistance(l.location) is not null
     order by c.Location.STDistance(l.location) 
 ) as city
 )

 select c.* 
  ,r.code as region, cn.code as country 
 from CTE C
 inner join regions r on r.regionid = c.regionid
 inner join Countries cn on cn.CountryId = r.countryid
 OPTION (MAXDOP 1, RECOMPILE);

Has Estimated number of rows decrease.
Is difference between Actual number of rows and Estimated number of rows decrease.