Searching for the nearest 10 points of interest gives, usually, near points. In that case the usual cartesian distance (SQRT((LAT1 - LAT2)^2 + (LONG1 - LONG2)^2)
) can be used.
Given a generic table (for the definition of latitude and longitude I've used this pointer.)
CREATE TABLE POI (
ID INT IDENTITY(1, 1)
, `Name` Varchar(100)
, `Type` CHAR(1)
, `LAT` Decimal(10, 8)
, `LNG` Decimal(11, 8)
)
The query would be something like
DECLARE myLAT Decimal(10, 8) DEFAULT 45.48561;
DECLARE myLNG Decimal(11, 8) DEFAULT 123.47981;
(SELECT `Name`, `LAT`, `LNG`
FROM POI
WHERE `Type` = 'A'
ORDER BY SQRT(POWER(myLAT - `LAT`), 2) + POWER(myLNG - `LNG`), 2)
LIMIT 10)
UNION ALL
(SELECT `Name`, `LAT`, `LNG`
FROM POI
WHERE `Type` = 'B'
ORDER BY SQRT(POWER(myLAT - `LAT`), 2) + POWER(myLNG - `LNG`), 2)
LIMIT 10)
UNION ALL
(SELECT `Name`, `LAT`, `LNG`
FROM POI
WHERE `Type` = 'C'
ORDER BY SQRT(POWER(myLAT - `LAT`), 2) + POWER(myLNG - `LNG`), 2)
LIMIT 10)
In case the point of interest are few and far from the 'here' point and from themselves you have to use a different formula for the distance, to account that the earth is, more or less, a sphere.
I'm going to go out on a bit of a limb here and guess what you are trying to achieve.
I suspect that you want to find the nearest neighbour from within the DS1 table for all the rows in DS1.
For a test set I created the following randomly filled table.
-- Test table
CREATE TABLE DS1 (
MSPID INTEGER IDENTITY(1,1) NOT NULL PRIMARY KEY,
Location Geometry NOT NULL
);
-- 1 million random points
INSERT INTO DS1 (Location)
SELECT TOP 1000000
Geometry::Point(
RAND(CAST(NEWID() AS VARBINARY))*100000,
RAND(CAST(NEWID() AS VARBINARY))*100000,
0) Location
FROM TALLY;
CREATE SPATIAL INDEX DS1_SIDX ON DS1 (Location)
USING GEOMETRY_AUTO_GRID
WITH (BOUNDING_BOX =(-15000, -15000, 2015000, 2015000));
I a version of your initial query for a quick test, but unfortunately on my desktop the optimiser decide that parallel was better and ignored the spatial index resulting in a 3 second query. Restricting it to a single core caused it to use the index and returned in milliseconds.
-- Nearest neighbour test
DECLARE @Location Geometry = Geometry::Point(50000,50000,0);
/* Ignored index
SQL Server Execution Times:
CPU time = 21781 ms, elapsed time = 3805 ms.
*/
SELECT TOP 1
MSPID,
Location.STDistance(@location) Distance
FROM DS1
WHERE Location.STDistance(@location) IS NOT NULL
ORDER BY Location.STDistance(@location)
/* Used index
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 10 ms.
*/
SELECT TOP 1
MSPID,
Location.STDistance(@location) Distance
FROM DS1
WHERE Location.STDistance(@location) IS NOT NULL
ORDER BY Location.STDistance(@location)
OPTION (MAXDOP 1);
To achieve what you want for a single row from DS1 and assuming that you want the nearest row from DS1 to the one you've picked, the following uses the index for me and again works in milliseconds.
-- For a specific row using cross apply
DECLARE @MSPID INTEGER = 500000;
SELECT
a.MSPID FromMSPID,
b.MSPID ToMSPID,
b.Distance
FROM DS1 a
CROSS APPLY (
SELECT TOP 1
MSPID,
c.Location.STDistance(a.Location) Distance
FROM DS1 c
WHERE c.Location.STDistance(a.Location) IS NOT NULL AND
c.MSPID <> a.MSPID
ORDER BY c.Location.STDistance(a.Location)
) b
WHERE a.MSPID = @MSPID
OPTION (MAXDOP 1);
This can be made to go through the entire table, but it will take some time to complete as it needs to find the nearest neighbour for each one. It took ~1 minute for 10,000, so for the full million that would take almost 2 hours. I have found that restricting this query to a single core isn't required and allowing parallism will improve performance. For me it halved the time. I would also recommend that you have a look at this question and it's answer for other performance considerations.
/* For all (well 10,000) using cross apply
SQL Server Execution Times:
CPU time = 60641 ms, elapsed time = 64545 ms.
*/
SELECT TOP 10000
a.MSPID FromMSPID,
b.MSPID ToMSPID,
b.Distance
FROM DS1 a
CROSS APPLY (
SELECT TOP 1
MSPID,
c.Location.STDistance(a.Location) Distance
FROM DS1 c
WHERE c.Location.STDistance(a.Location) IS NOT NULL AND
c.MSPID <> a.MSPID
ORDER BY c.Location.STDistance(a.Location)
) b
OPTION (MAXDOP 1);
Best Answer
This query worked for me. It uses GEOLOC from Oracle Spartial
Note that first argument is restaurants, not nodes.