Sql-server – Improving a DbGeography query

indexoptimizationperformancequery-performancespatialsql server

I am still new to database administration and I'm trying to optimise a search query.

I had a query which looked like this and took 5-15 seconds to execute in some cases, and also was causing 100% CPU usage:

DECLARE @point geography;
SET @point = geography::STPointFromText('POINT(3.3109015 6.648294)', 4326); 

SELECT TOP (1)
     [Result].[PointId] AS [PointId], 
     [Result].[PointName] AS [PointName], 
     [Result].[LegendTypeId] AS [LegendTypeId], 
     [Result].[GeoPoint] AS [GeoPoint]
FROM ( 
    SELECT 
        [Extent1].[GeoPoint].STDistance(@point) AS distance, 
        [Extent1].[PointId] AS [PointId], 
        [Extent1].[PointName] AS [PointName], 
        [Extent1].[LegendTypeId] AS [LegendTypeId], 
        [Extent1].[GeoPoint] AS [GeoPoint]
    FROM [dbo].[GeographyPoint] AS [Extent1]
    WHERE 18 = [Extent1].[LegendTypeId] 
)  AS [Result]
ORDER By [Result].distance ASC

This table has one clustered index on the PK, and one spatial Index on the geography type column.

enter image description here

So when I executed the above query, it was performing a scan operation.

enter image description here

So I created a non-clustered index on the LegendTypeId column:

CREATE NONCLUSTERED INDEX [GeographyPoint_LegendType_NonClustered] ON [dbo].[GeographyPoint]
(
    [LegendTypeId] ASC
)
INCLUDE (   [PointId],
    [PointName],
    [GeoPoint]) 
    WITH (PAD_INDEX = OFF, 
    STATISTICS_NORECOMPUTE = OFF,
    SORT_IN_TEMPDB = OFF, 
    DROP_EXISTING = OFF,
    ONLINE = OFF,
    ALLOW_ROW_LOCKS = ON, 
    ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

and changed the query to:

DECLARE @point geography;
SET @point = geography::STPointFromText('POINT({0} {1})', 4326); 

 SELECT TOP (1) 
     [GeoPoint].STDistance(@point) AS distance, 
     [PointId], 
     [PointName],
     [LegendTypeId], 
     [GeoPoint]
     FROM [GeographyPoint]
 WHERE 18 = [LegendTypeId]
 ORDER By distance ASC

And now SQL Server performs a seek instead of the scan:

enter image description here

which in my view has increased the efficiency of the query, but when I deploy this to production, I still get the same results (high CPU usage, and average of 10 seconds to execute the query).

Note: No data is inserted, updated or removed from this table – only
search/read.

  1. Is it something I am doing wrong?

  2. How can I fix this?

EDIT

Index Seak details

enter image description here

EDIT 2:

I changed the query, to use the method: 'Nearest Neighbor' from link: https://msdn.microsoft.com/en-us/library/ff929109.aspx, and now this is the result, This Query is also taking 3-5 seconds for the search – similar to the Second Query, (but not tested on Production)

enter image description here

Spatial Index settings:

CREATE SPATIAL INDEX [SPATIAL_Point] ON [dbo].[GeographyPoint]
(
[GeoPoint]
)USING  GEOGRAPHY_GRID 
WITH (GRIDS =(LEVEL_1 = MEDIUM,LEVEL_2 = MEDIUM,LEVEL_3 = MEDIUM,LEVEL_4 = MEDIUM), 
CELLS_PER_OBJECT = 16, PAD_INDEX = OFF, 
STATISTICS_NORECOMPUTE = 
OFF, SORT_IN_TEMPDB = OFF,
 DROP_EXISTING = OFF, 
 ONLINE = OFF, 
 ALLOW_ROW_LOCKS = ON, 
 ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

EDIT 3
I followed instructions by @MickyT, dropped the Index on [LegendTypeId],and executed the following query:

DECLARE @point geography;
SET @point = geography::STPointFromText('POINT(3.3109 6.6482)', 4326); 

SELECT TOP (1) 

    [PointId],
    [PointName],
    [LegendTypeId], 
    [GeoPoint]
FROM [GeographyPoint] WITH(INDEX(SPATIAL_Point))
WHERE 
   [GeoPoint].STDistance(@point) IS NOT NULL AND
    18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)

Statistics for this query are

enter image description here

And then I executed this Query again:

DECLARE @point geography;
SET @point = geography::STPointFromText('POINT(3.3109 6.6482)', 4326); 

 SELECT TOP (1) 
     [GeoPoint].STDistance(@point) AS distance, 
     [PointId], 
     [PointName],
     [LegendTypeId], 
     [GeoPoint]
     FROM [GeographyPoint] --WITH(INDEX(SPATIAL_Point))
 WHERE 18 = [LegendTypeId]
 ORDER By distance ASC

Statistics for this query are

enter image description here

Best Answer

I used the following setup to run some tests against.

CREATE TABLE GeographyPoint (
    ID INTEGER IDENTITY(1,1) NOT NULL PRIMARY KEY,
    GeoPoint GEOGRAPHY NOT NULL,
    LegendTypeID INTEGER NOT NULL
    );

INSERT INTO GeographyPoint (GeoPoint, LegendTypeID)
SELECT TOP 1000000 
    Geography::Point(RAND(CAST(NEWID() AS VARBINARY(MAX))) * 2,RAND(CAST(NEWID() AS VARBINARY(MAX))) * 2,4326),
    CAST(RAND(CAST(NEWID() AS VARBINARY(MAX))) * 25 AS INTEGER)
FROM Tally;

CREATE INDEX GP_IDX1 ON GeographyPoint(LegendTypeID) INCLUDE (ID, GeoPoint);
CREATE SPATIAL INDEX GP_SIDX ON GeographyPoint(GeoPoint) USING GEOGRAPHY_AUTO_GRID;

This gives a table of 1,000,000 random points with a 2 x 2 degree spread.
After trying a few different options on it, the best performance I could get was forcing it to use the spatial index. There was a couple of ways to achieve this. Dropping the index on LegendTypeID or using a hint.
You will need to decide which is best for your situation. Personally I don't like using index hints and would drop the other index if it is not required for other queries.

The queries stacked up against each other

DECLARE @point geography;
SET @point = geography::Point(1,1,4326); 
/*
Clustered index scan (PK)
 SQL Server Execution Times:
   CPU time = 641 ms,  elapsed time = 809 ms
*/
SELECT TOP (1) 
    [GeoPoint].STDistance(@point) AS distance, 
    [ID], 
    [LegendTypeId], 
    [GeoPoint]
FROM [GeographyPoint]
WHERE 18 = [LegendTypeId]
ORDER By distance ASC
OPTION(MAXDOP 1)
/*
Index Seek NonClustered (GP_IDX1)
 SQL Server Execution Times:
   CPU time = 2250 ms,  elapsed time = 2806 ms
*/
SELECT TOP (1) 
    [GeoPoint].STDistance(@point) AS distance, 
    [ID], 
    [LegendTypeId], 
    [GeoPoint]
FROM [GeographyPoint]
WHERE [GeoPoint].STDistance(@point) IS NOT NULL AND
    18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)

/*
For the next 2 queries
Clustered Index Seek (Spatial)
 SQL Server Execution Times:
   CPU time = 15 ms,  elapsed time = 11 ms
*/
SELECT TOP (1) 
    [GeoPoint].STDistance(@point) AS distance, 
    [ID], 
    [LegendTypeId], 
    [GeoPoint]
FROM [GeographyPoint] WITH(INDEX(GP_SIDX))
WHERE [GeoPoint].STDistance(@point) IS NOT NULL AND
    18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)

DROP INDEX GP_IDX1 ON [GeographyPoint]

SELECT TOP (1) 
    [GeoPoint].STDistance(@point) AS distance, 
    [ID], 
    [LegendTypeId], 
    [GeoPoint]
FROM [GeographyPoint]
WHERE [GeoPoint].STDistance(@point) IS NOT NULL AND
    18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)