Spatialite: use of virtual spatial index for distances

spatialsqlitesubquery

I'm currently learning to use spatial access methods for query optimization. I'm going through the examples in spatialite cookbook and got stuck in here
http://www.gaia-gis.it/gaia-sins/spatialite-cookbook/html/pp-adjacent.html

According to the tutorial, in order to find the pairs of populated places that their distance is < 1km we have to do this query

    SELECT pp1.name AS "PopulatedPlace #1",
      GeodesicLength(MakeLine(pp1.geometry, pp2.geometry)) AS "Distance (meters)",
      pp2.name AS "PopulatedPlace #2"
    FROM populated_places AS pp1,
    populated_places AS pp2
    WHERE GeodesicLength(
      MakeLine(pp1.geometry, pp2.geometry)) < 1000.0
      AND pp1.id <> pp2.id
    AND pp2.ROWID IN (
      SELECT pkid
      FROM idx_populated_places_geometry
      WHERE pkid MATCH RTreeDistWithin(
       ST_X(pp1.geometry),
       ST_Y(pp1.geometry), 0.02))
    ORDER BY 2;

Which throws an error because of no longer use of geocallback functions RTree***. I checked the update on this and this has been substituted by the use of virtual spatial index. We have to use a subquery of the form

    SELECT ROWID
    FROM SpatialIndex
    WHERE
       f_table_name = <table_name>
       AND search_frame = <some_geometry>

So i tested this

    SELECT pp1.name AS "PopulatedPlace #1",
      GeodesicLength(MakeLine(pp1.geometry, pp2.geometry)) AS "Distance (meters)",
      pp2.name AS "PopulatedPlace #2"
    FROM populated_places AS pp1,
    populated_places AS pp2
    WHERE GeodesicLength(
      MakeLine(pp1.geometry, pp2.geometry)) < 1000.0
      AND pp1.id <> pp2.id
    AND pp2.ROWID IN (
      SELECT ROWID
      FROM SpatialIndex
      WHERE
       f_table_name ='populated_places'
       AND search_frame = pp1.geometry)
    ORDER BY 2;

And the result set was awfully wrong

   PopulatedPlace #1    Distance (meters)   PopulatedPlace #2
              Ariano    0.000000             Ariano
   Campolongo Maggiore  0.000000             Campolongo Maggiore
   Campolongo Maggiore  0.000000             Campolongo Maggiore
   Campolongo Maggiore  0.000000             Campolongo Maggiore
   Campolongo Maggiore  0.000000             Campolongo Maggiore
   Campolongo Maggiore  0.000000             Campolongo Maggiore
   Campolongo Maggiore  0.000000             Campolongo Maggiore
              Ariano    0.000000             Ariano

Could please someone explain this behavior and/or give an example of proper use of the virtual spatial index in order to obtain distance queries?

Best Answer

I know it is late but I recently stumbled accross the same problem. It is not working in your example because the search_frame parameter act as a filter and is equivalent to the function RTreeIntersect.

Here since you use a point it will only look for point wich intersect togetether (when using their bounding box). The idea is to use a bouding box centered on the pp1 point, with side the length of the desired distance.

SELECT pp1.name AS "PopulatedPlace #1",
  GeodesicLength(MakeLine(pp1.geometry, pp2.geometry)) AS "Distance (meters)",
  pp2.name AS "PopulatedPlace #2"
FROM populated_places AS pp1,
populated_places AS pp2
WHERE GeodesicLength(
  MakeLine(pp1.geometry, pp2.geometry)) < 1000.0
  AND pp1.id <> pp2.id
AND pp2.ROWID IN (
  SELECT ROWID
  FROM SpatialIndex
  WHERE
   f_table_name ='populated_places'
   AND search_frame = ST_Expand(pp1.geometry, 500.0)
ORDER BY 2;

To do that we simply use the function ST_Expand which return the bounding box envelopping the geometry (here it is the point) and expand it uniformly.

So we filter the search as follow : only pp2 point inside the square centered on pp1 with 1000 meter side length will be considered.