Oracle PL/SQL – Find Nearest Object

nearest neighbororacle-12cplsql

There are two tables containing objects and their coordinates. One contains hotels, the other contains restaurants. How to build a table that for each hotel lists nearest restaurant name and the distance?

WITH restaurant AS (
           SELECT 'A' r_name, 0.0 x, 0.0 y FROM dual UNION ALL
           SELECT 'B' r_name, 0.0 x, 6.0 y FROM dual UNION ALL
           SELECT 'C' r_name, 9.5 x, 0.5 y FROM dual UNION ALL
           SELECT 'D' r_name, 8.2 x, 4.7 y FROM dual -- etc           
           ),

     hotel AS ( 
           SELECT 'First_Hotel' h_name, 1.0 x, 6.0 y FROM dual UNION ALL  
           SELECT 'Second_Hotel' h_name, -4.0 x, 3.0 y FROM dual -- etc
           )               

     select h.h_name, r.r_name as nearest_restaurant, distance from restaurant r, hotel h  /* some query */

Should return

 hotel          nearest_restaurant    distance
 First_Hotel       B                    1     -- B is the nearest restaurant to First_Hotel. Distance is sqrt((1.0-0.0)^2+(6.0-6.0)^2)=1
 Second_Hotel      A                    5     --if both A and B on the same distance choose based on r_name

Edited: I also have geoloc for both tables with x and y coordinates. Can geoloc help to build the query?

Best Answer

This query worked for me. It uses GEOLOC from Oracle Spartial

    create table my_result_table as  (
    select n.nodeid, n.geoloc.sdo_point.y as node_y, n.geoloc.sdo_point.x as node_x, 
        r.geoloc.sdo_point.y as sl_y, r.geoloc.sdo_point.x as sl_x,
        sdo_nn_distance(1)  as sld
       from  my_nodes n,  my_restaurants r
       where
       sdo_nn   ( r.GEOLOC, n.geoloc,  'Unit = MILE sdo_num_res = 1', 1) = 'TRUE'
        );

Note that first argument is restaurants, not nodes.

Related Solutions

Mysql find 10 nearest point from given lat/long per type

Searching for the nearest 10 points of interest gives, usually, near points. In that case the usual cartesian distance (SQRT((LAT1 - LAT2)^2 + (LONG1 - LONG2)^2)) can be used.

Given a generic table (for the definition of latitude and longitude I've used this pointer.)

CREATE TABLE POI (
   ID INT IDENTITY(1, 1)
 , `Name` Varchar(100)
 , `Type` CHAR(1)
 , `LAT`  Decimal(10, 8)
 , `LNG`  Decimal(11, 8)
)

The query would be something like

DECLARE myLAT Decimal(10, 8) DEFAULT 45.48561;
DECLARE myLNG Decimal(11, 8) DEFAULT 123.47981;

(SELECT `Name`, `LAT`, `LNG`
 FROM   POI
 WHERE  `Type` = 'A'
 ORDER BY SQRT(POWER(myLAT - `LAT`), 2) + POWER(myLNG - `LNG`), 2)
 LIMIT 10)
UNION ALL
(SELECT `Name`, `LAT`, `LNG`
 FROM   POI
 WHERE  `Type` = 'B'
 ORDER BY SQRT(POWER(myLAT - `LAT`), 2) + POWER(myLNG - `LNG`), 2)
 LIMIT 10)
UNION ALL
(SELECT `Name`, `LAT`, `LNG`
 FROM   POI
 WHERE  `Type` = 'C'
 ORDER BY SQRT(POWER(myLAT - `LAT`), 2) + POWER(myLNG - `LNG`), 2)
 LIMIT 10)

In case the point of interest are few and far from the 'here' point and from themselves you have to use a different formula for the distance, to account that the earth is, more or less, a sphere.

Sql-server – Query Spatial Data for Nearest Neighbor for each group

I'm going to go out on a bit of a limb here and guess what you are trying to achieve.

I suspect that you want to find the nearest neighbour from within the DS1 table for all the rows in DS1.

For a test set I created the following randomly filled table.

-- Test table
CREATE TABLE DS1 ( 
  MSPID INTEGER IDENTITY(1,1) NOT NULL PRIMARY KEY,
  Location Geometry NOT NULL
);

-- 1 million random points
INSERT INTO DS1 (Location)
SELECT TOP 1000000 
  Geometry::Point(
    RAND(CAST(NEWID() AS VARBINARY))*100000,
    RAND(CAST(NEWID() AS VARBINARY))*100000,
    0) Location
FROM TALLY;

CREATE SPATIAL INDEX DS1_SIDX ON DS1 (Location) 
USING GEOMETRY_AUTO_GRID 
WITH (BOUNDING_BOX =(-15000, -15000, 2015000, 2015000));

I a version of your initial query for a quick test, but unfortunately on my desktop the optimiser decide that parallel was better and ignored the spatial index resulting in a 3 second query. Restricting it to a single core caused it to use the index and returned in milliseconds.

-- Nearest neighbour test
DECLARE @Location Geometry = Geometry::Point(50000,50000,0);

/* Ignored index
 SQL Server Execution Times:
   CPU time = 21781 ms,  elapsed time = 3805 ms.
*/
SELECT TOP 1  
  MSPID, 
  Location.STDistance(@location) Distance
FROM DS1
WHERE Location.STDistance(@location) IS NOT NULL
ORDER BY Location.STDistance(@location)

/* Used index
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 10 ms.
*/
SELECT TOP 1  
  MSPID, 
  Location.STDistance(@location) Distance
FROM DS1
WHERE Location.STDistance(@location) IS NOT NULL
ORDER BY Location.STDistance(@location)
OPTION (MAXDOP 1);

To achieve what you want for a single row from DS1 and assuming that you want the nearest row from DS1 to the one you've picked, the following uses the index for me and again works in milliseconds.

-- For a specific row using cross apply
DECLARE @MSPID INTEGER = 500000;

SELECT 
  a.MSPID FromMSPID,
  b.MSPID ToMSPID,
  b.Distance
FROM DS1 a
  CROSS APPLY (
    SELECT TOP 1  
      MSPID, 
      c.Location.STDistance(a.Location) Distance
    FROM DS1 c
    WHERE c.Location.STDistance(a.Location) IS NOT NULL AND
      c.MSPID <> a.MSPID
    ORDER BY c.Location.STDistance(a.Location)
    ) b
WHERE a.MSPID = @MSPID
OPTION (MAXDOP 1);

This can be made to go through the entire table, but it will take some time to complete as it needs to find the nearest neighbour for each one. ~~It took ~1 minute for 10,000, so for the full million that would take almost 2 hours.~~ I have found that restricting this query to a single core isn't required and allowing parallism will improve performance. For me it halved the time. I would also recommend that you have a look at this question and it's answer for other performance considerations.

/* For all (well 10,000) using cross apply
 SQL Server Execution Times:
   CPU time = 60641 ms,  elapsed time = 64545 ms.
*/
SELECT TOP 10000
  a.MSPID FromMSPID,
  b.MSPID ToMSPID,
  b.Distance
FROM DS1 a
  CROSS APPLY (
    SELECT TOP 1  
      MSPID, 
      c.Location.STDistance(a.Location) Distance
    FROM DS1 c
    WHERE c.Location.STDistance(a.Location) IS NOT NULL AND
      c.MSPID <> a.MSPID
    ORDER BY c.Location.STDistance(a.Location)
    ) b
OPTION (MAXDOP 1);

Best Answer

Related Solutions

Mysql find 10 nearest point from given lat/long per type

Sql-server – Query Spatial Data for Nearest Neighbor for each group

Related Question