I have this query (called from PHP, hence the $
variables):
SELECT a.name,
St_AsText(a.geom),
b.name AS name_b,
b.id
FROM a
LEFT JOIN b ON b.id = a.b_id
WHERE UNACCENT(LOWER(a.name)) LIKE UNACCENT('$searched%')
ORDER BY a.geom <-> 'SRID=4326;POINT($lat $lon)'::geometry
LIMIT 2;
There is lower(a.name)
BTree index on a.name
, GIST index on a.geom
and b.id
is primary key
It runs about 40ms for $searched = "Maria%"
. Now, if I set $searched = "Mariasomething some%"
, time goes up to 2s.
Why? Can it be speed up or perform in a more stable manner?
Best Answer
Explain
We would have to see table definition, cardinalities and the
EXPLAIN
output to be certain, but the reason is most likely this:Only your spatial GiST index on
a.geom
can be used. The btree index is not applicable. Postgres walks through the "closest" rows until it finds the first two matching your predicate.Normally, more restrictive conditions make queries faster, since fewer rows have to be fetched (and sorted). But not in this constellation: Fewer qualifying rows don't help a nearest neighbour search with a small
LIMIT
- au contrair. There are two options:The "nearest neighbour" search based on the GiST index - which is fast as long as some of the closest rows qualify - fastest if you remove the
WHERE
clause altogether, try it! It gets increasingly expensive if many rows have to be filtered. Basically: the longer your pattern the more expensive the query.A sequential scan, which has to solve the same problem for selective conditions: filter many rows until finding some candidates. The method is faster, but the whole table has to be read, and if more than a few rows are found, the final sort by distance gets expensive.
Basically, Postgres is stuck between a rock and a hard place with this wicked query.
Fix
You would need an index on
unaccent(lower(name))
which is not possible, becauseunaccent()
is onlySTABLE
, notIMMUTABLE
. But you can work around this limitation with anIMMUTABLE
function wrapper - done right:The best index still seems tricky and depends on your complete situation.
Since you match left-anchored patterns, I would try a separate btree index with the
text_pattern_ops
operator class, using the functionf_unaccent()
outlined in the linked answer above:More:
Then:
WHERE
condition to match the indexed expressionI use dollar-quoting to avoid escaping single quotes.
For repeated calls use
pg_prepare
andpg_execute
. Details:Or you could try a GiST index - a multicolumn, functional GiST index combining a PostGis spatial GiST index and a functional trigram GiST index. I didn't test that combination, yet.
You need to install pg_trgm first. Instructions:
I put
f_unaccent(lower(name))
first because, per documentation: