PostGIS Performance – Comparing && vs ST_Intersects

Tags: postgis, postgresql, spatial

I am trying to figure out which points reside inside a rectangular area (envelope). I am having a little difficulty understanding the performance implications of using the && operator compared to ST_Intersects.

However, I think I answered my own question while trying to formulate it. I am submitting it anyway in case it is useful to somebody.

The manual for && says (ironically the page name is geometry_overlaps.html):

&& — Returns TRUE if A's 2D bounding box intersects B's 2D bounding box.

The manual for ST_Intersects says:

Returns TRUE if the Geometries/Geography "spatially intersect in 2D" – (share any portion of space) and FALSE if they don't (they are Disjoint). For geography — tolerance is 0.00001 meters (so any points that close are considered to intersect)

ST_Intersects will return true in such cases. For my purposes, the && operator does almost the same thing, since I am using a rectangular bounding box in either case. I am wondering whether && is the fastest operator for my purpose. It feels like && has to do fewer checks, so it must be much more efficient.

Here is an example copied directly from the && documentation and adapted to include ST_Intersects:

SELECT
        t1.id AS t1,
        t2.id AS t2,
        t1.ln && t2.ln AS "&&",
        ST_Intersects(t1.ln,t2.ln)
FROM ( VALUES
        (1, 'LINESTRING(0 0, 3 3)'::geometry),
        (2, 'LINESTRING(0 1, 0 5)'::geometry)
) AS t1(id,ln)
CROSS JOIN (VALUES
        (3, 'LINESTRING(1 2, 4 6)'::geometry)
) AS t2(id,ln);

 t1 | t2 | && | st_intersects 
----+----+----+---------------
  1 |  3 | t  | f
  2 |  3 | f  | f
(2 rows)

Below is a simple plot of what these lines look like. I plotted it just to see exactly how the bounding box works for lines. For my purposes, the points are always inside the box (this line example does not really reflect what I am trying to do, but it is good for illustration).

[Figure: plot of the example lines]

Question 1: Is && significantly faster than ST_Intersects when used with ST_MakeEnvelope (a rectangular boundary) to find points inside a rectangular bounding box?

Question 2: Am I understanding correctly that, when checking points inside a rectangular boundary, && does exactly the same thing as ST_Intersects?

Best Answer

Background, functionality and performance

&& operator

&& is the bounding-box overlaps operator. All operators in PostgreSQL call functions; you can see this with \doS+ &&. In this case, && literally calls the PostGIS function geometry_overlaps. The only catch here is that && will make use of an index. From the docs:

In general, you will want to use the "intersects operator" (&&) which tests whether the bounding boxes of features intersect. The reason the && operator is useful is because if a spatial index is available to speed up the test, the && operator will make use of this. This can make queries much much faster.

You can see in the definition of geometry_overlaps that it calls an internal C function, gserialized_overlaps_2d. That function uses 4 comparisons to determine whether or not the bounding boxes overlap. On its own that's not usually all that useful, except as a cheap filter that adds selectivity, so you don't normally want it by itself.
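To make the "4 comparisons" concrete, here is a rough sketch of the same test written out by hand in SQL, using the first and third lines from the question. This is only an illustration of the idea, not the actual C implementation: PostGIS compares the boxes it caches internally (stored in single precision), so results could differ from this hand-written version in edge cases near a box boundary.

```sql
-- Hand-written bounding-box overlap test next to the && operator.
-- ST_XMin and friends accept a geometry through an implicit cast to box3d.
WITH g(a, b) AS (
    VALUES (
        'LINESTRING(0 0, 3 3)'::geometry,
        'LINESTRING(1 2, 4 6)'::geometry
    )
)
SELECT
    a && b AS bbox_operator,
    -- Two boxes overlap when they overlap on both the X and the Y axis.
    ST_XMin(a) <= ST_XMax(b) AND ST_XMax(a) >= ST_XMin(b)
    AND ST_YMin(a) <= ST_YMax(b) AND ST_YMax(a) >= ST_YMin(b) AS four_comparisons
FROM g;
```

For these small integer coordinates both columns should agree, which is the point: nothing about the actual shape of the geometries is consulted.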

That means this isn't really a performance question: && just doesn't do much. However, what && does do can make use of a GiST index.
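If you want to check the index usage yourself, something along these lines should work. The table name pts, the index name pts_geom_gix, the SRID and the random test data are all made up for illustration; substitute your own table.

```sql
-- Hypothetical table, index name and test data, for illustration only.
CREATE TABLE pts (
    id   serial PRIMARY KEY,
    geom geometry(Point, 4326)
);

-- 100,000 random points in a 10 x 10 degree square.
INSERT INTO pts (geom)
SELECT ST_SetSRID(ST_MakePoint(random() * 10, random() * 10), 4326)
FROM generate_series(1, 100000);

-- The spatial index that && is able to use.
CREATE INDEX pts_geom_gix ON pts USING GIST (geom);
ANALYZE pts;

-- Inspect the plan for a bounding-box search against an envelope.
EXPLAIN
SELECT id
FROM pts
WHERE geom && ST_MakeEnvelope(1, 1, 2, 2, 4326);
```

With a selective envelope like this one I would expect the plan to mention the GiST index (as an index scan or a bitmap index scan), but the planner makes that call based on table size and statistics, so look at the EXPLAIN output rather than assuming.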

ST_Intersects

From the ST_Intersects docs:

This function call will automatically include a bounding box comparison that will make use of any indexes that are available on the geometries.

The reason why is simple: only the bounding-box comparison uses the index. That means it'll do a && AND something else. You can see that with \dfS+ st_intersects

SELECT $1 && $2 AND _ST_Intersects($1,$2);

So the extra bit it does is call either geos_intersects or sfcgal_intersects, depending on your chosen back end. In the best case, where you get geos_intersects, you can see what that does here.

In essence, it is telling you if any point intersects without making any assumptions (other than floating point math).

Mixed SRIDs

As a last note, it may be worth noting that these two operations handle mixed SRIDs differently.
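A quick way to see this on your own installation is to run both against the same coordinates tagged with two different SRIDs. The SRIDs 4326 and 3857 below are arbitrary picks for the sketch. I would expect ST_Intersects to refuse the input with a mixed-SRID error, while what && does may depend on your PostGIS version, so I am not showing output here.

```sql
-- Same coordinates, different SRIDs (4326 vs 3857 chosen arbitrarily).
-- Run the two statements separately, since one of them may raise an error.

-- Bounding-box operator:
SELECT ST_SetSRID('POINT(1 1)'::geometry, 4326)
    && ST_SetSRID('POINT(1 1)'::geometry, 3857) AS bbox_overlap;

-- Exact intersection test:
SELECT ST_Intersects(
    ST_SetSRID('POINT(1 1)'::geometry, 4326),
    ST_SetSRID('POINT(1 1)'::geometry, 3857)
) AS exact_intersects;
```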

Your questions

Question 1: Is && significantly faster than ST_Intersects when used with ST_MakeEnvelope (a rectangular boundary) to find points inside a rectangular bounding box?

Yes, it's significantly faster, because it does less. It doesn't find "points" inside a rectangular bounding box unless one side is a simple point. Other than that, it finds bounding-box overlaps, which are subject to false positives: two bounding boxes can overlap even when the geometries themselves share no point.
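A small sketch of where the false positives come from, using made-up geometries (the labels, the point coordinates and the triangle are mine, not from the question):

```sql
-- Compare the bounding-box operator with the exact test on three cases.
SELECT
    label,
    pt && shape              AS bbox_overlap,
    ST_Intersects(pt, shape) AS exact_intersects
FROM ( VALUES
    -- A point against an axis-aligned envelope: the box is the shape.
    ('point inside envelope',  'POINT(1 1)'::geometry, ST_MakeEnvelope(0, 0, 2, 2)),
    ('point outside envelope', 'POINT(3 3)'::geometry, ST_MakeEnvelope(0, 0, 2, 2)),
    -- A point inside the triangle's bounding box but outside the triangle.
    ('point near triangle',    'POINT(3 3)'::geometry, 'POLYGON((0 0, 4 0, 0 4, 0 0))'::geometry)
) AS t(label, pt, shape);
```

The first two rows should agree in both columns (true/true, then false/false). The third row is the false positive: the point falls inside the triangle's bounding box, so && reports true, but it lies outside the triangle itself, so ST_Intersects reports false.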

Question 2: Am I understanding correctly that, when checking points inside a rectangular boundary, && does exactly the same thing as ST_Intersects?

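No. It should be clear why now: && only compares bounding boxes, while ST_Intersects runs that same comparison and then an exact geometry test on top of it.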