Mysql find 10 nearest point from given lat/long per type

MySQLnearest neighbor

I have a database given with 3 different types of locations A, B and C. These points have latitude and longitude information with them.

Now I have a random lat-long given and I want to find the closest 10 points per type to that given location – how can I do that? Anyone has an idea?`

Thanks for any help

Best Answer

Searching for the nearest 10 points of interest gives, usually, near points. In that case the usual cartesian distance (SQRT((LAT1 - LAT2)^2 + (LONG1 - LONG2)^2)) can be used.

Given a generic table (for the definition of latitude and longitude I've used this pointer.)

CREATE TABLE POI (
   ID INT IDENTITY(1, 1)
 , `Name` Varchar(100)
 , `Type` CHAR(1)
 , `LAT`  Decimal(10, 8)
 , `LNG`  Decimal(11, 8)
)

The query would be something like

DECLARE myLAT Decimal(10, 8) DEFAULT 45.48561;
DECLARE myLNG Decimal(11, 8) DEFAULT 123.47981;

(SELECT `Name`, `LAT`, `LNG`
 FROM   POI
 WHERE  `Type` = 'A'
 ORDER BY SQRT(POWER(myLAT - `LAT`), 2) + POWER(myLNG - `LNG`), 2)
 LIMIT 10)
UNION ALL
(SELECT `Name`, `LAT`, `LNG`
 FROM   POI
 WHERE  `Type` = 'B'
 ORDER BY SQRT(POWER(myLAT - `LAT`), 2) + POWER(myLNG - `LNG`), 2)
 LIMIT 10)
UNION ALL
(SELECT `Name`, `LAT`, `LNG`
 FROM   POI
 WHERE  `Type` = 'C'
 ORDER BY SQRT(POWER(myLAT - `LAT`), 2) + POWER(myLNG - `LNG`), 2)
 LIMIT 10)

In case the point of interest are few and far from the 'here' point and from themselves you have to use a different formula for the distance, to account that the earth is, more or less, a sphere.

Related Solutions

Mysql – How to Best Implement nearest neighbour search in thesql

As far as design patterns, the Yelp question is pretty standard stuff.

For a more complex answer, you will probably need the geospatial distance. Here is a fascinating powerpoint about that topic (and here is a pdf version of that as well). However, the math involved is quite ugly.

From their slide:

set @orig_lat=122.4058; set @orig_lon=37.7907;
set @dist=10;

SELECT *, 3956 * 2 * ASIN(SQRT(
POWER(SIN((@orig_lat - abs(dest.lat)) * pi()/180 / 2), 2) +  COS(@orig_lat * pi()/180 ) * COS(abs(dest.lat) * pi()/180) *  POWER(SIN((@orig_lon – dest.lon) * pi()/180 / 2), 2) )) as  distance
FROM hotels dest 
having distance < @dist
ORDER BY distance limit 10

There's a longer, more in-depth answer about geospatial distance on Stack Overflow.

But you still want to limit the results by latitude and longitude.

Ultimately, I would avoid the POINT datatype and go with latitude/longitude. There's currently no way to determine the distance between two POINTs, so you're going to have to store latitude/longitude for that calculation anyways.

One last link: you may also want to check out this SO thread regarding speeding up the queries using spatial indexes.

Mysql – How does compound index help speed up query

Indexes are in most cases a B-Tree structure (or some sort of). There are DBMS that supported different indexing types.

As you are talking about longitude/latitude PostgreSQL with it's GIST indexes come to mind. Oracle has Bitmap indexes in addition to the B-Tree indexes. I'm sure SQL Server and DB2 also have some special index types. And then there are full text indexes which make searching text very efficient.

A B-Tree index is very efficient in finding a specific value - think of a primary key index where all values are different. If the index only contains the PK column(s) (i.e. it is not a clustered index) then typically looking up a row by a specific PK values takes not more than (roughly) 3-4 IO operations (at least with Oracle). 2-3 to find the index block and an additional one to read the whole row. This gets more efficient if the index contains additional columns so that the lookup of the actual table row is not needed. The term for that is "covering index" or "index only retrieval".

Now for doing "range lookups" (e.g. where foo > 42) an index is very helpful as well as in most DBMS the index can also be scanned according to a predicate. Usually (again this highly depends on the DBMS) this is slightly less efficient than a direct lookup (again this also depends on the ability to do an "index only retrieval").

I don't know of any BMS which can not use more than one index in a query. Think a join on a PK and a FK column - depending on the data distribution the DBMS might use the index to find the parent rows (PK lookup) and the child rows (FK lookup).

But not all DBMS can use more than one index for the same table in a single query.

After all whether or not an index is being used or not depends on a lot of things.

I can highly recommend http://use-the-index-luke.com/ which is a very good introduction on indexing across all major DBMS.

DBMS specific information:

Oracle: http://docs.oracle.com/cd/E11882_01/server.112/e25789/indexiot.htm
PostgreSQL: http://www.postgresql.org/docs/current/static/indexes.html

Best Answer

Related Solutions

Mysql – How to Best Implement nearest neighbour search in thesql

Mysql – How does compound index help speed up query

Related Question