I have this query:
SELECT *
FROM location
WHERE to_tsvector('simple',unaccent2("city"))
@@ to_tsquery('simple',unaccent2('wroclaw'))
order by displaycount
I'm happy with it:
"Sort (cost=3842.56..3847.12 rows=1826 width=123) (actual time=1.915..2.084 rows=1307 loops=1)"
" Sort Key: displaycount"
" Sort Method: quicksort Memory: 206kB"
" -> Bitmap Heap Scan on location (cost=34.40..3743.64 rows=1826 width=123) (actual time=0.788..1.208 rows=1307 loops=1)"
" Recheck Cond: (to_tsvector('simple'::regconfig, unaccent2((city)::text)) @@ '''wroclaw'''::tsquery)"
" -> Bitmap Index Scan on location_lower_idx (cost=0.00..33.95 rows=1826 width=0) (actual time=0.760..0.760 rows=1307 loops=1)"
" Index Cond: (to_tsvector('simple'::regconfig, unaccent2((city)::text)) @@ '''wroclaw'''::tsquery)"
"Total runtime: 2.412 ms"
But when I add LIMIT, the execution takes more than 2 seconds:
SELECT *
FROM location
WHERE to_tsvector('simple',unaccent2("city"))
@@ to_tsquery('simple',unaccent2('wroclaw'))
order by displaycount
limit 20
Explain:
"Limit (cost=0.00..1167.59 rows=20 width=123) (actual time=2775.452..2775.643 rows=20 loops=1)"
" -> Index Scan using location_displaycount_index on location (cost=0.00..106601.25 rows=1826 width=123) (actual time=2775.448..2775.637 rows=20 loops=1)"
" Filter: (to_tsvector('simple'::regconfig, unaccent2((city)::text)) @@ '''wroclaw'''::tsquery)"
"Total runtime: 2775.693 ms"
I think that it's some issue with ORDER BY and LIMIT. How can I force PostgreSQL to use the index and do the ordering at the end?
Subquery doesn't help:
SELECT *
FROM (
SELECT *
FROM location
WHERE to_tsvector('simple',unaccent2("city"))
@@ to_tsquery('simple',unaccent2('wroclaw'))
order by displaycount
) t
LIMIT 20;
or:
SELECT *
FROM (
SELECT *
FROM location
WHERE to_tsvector('simple',unaccent2("city"))
@@ to_tsquery('simple',unaccent2('wroclaw'))
) t
order by displaycount
LIMIT 20;
Best Answer
My guess is that this would fix your query:
I repeat the
WHERE
condition as first element of theORDER BY
clause - which is logically redundant, but should keep the query planner from assuming it would be better off processing rows according to the indexlocation_displaycount_index
- which turns out to be much more expensive.The underlying problem is that the query planner obviously grossly misjudges the selectivity and / or cost of your
WHERE
condition. I can only speculate why that is.Do you have autovacuum running - which should also take care of running
ANALYZE
on your tables? Thereby, are your table-statistics up-to-date? Any effect if you run:And try again?
It can also be that the selectivity of the
@@
operator is being misjudged. I would imagine that it is very hard to estimate for logical reasons.If my query should not solve the problem, and generally to verify the underlying theory do one of these two things:
temporarily delete the index
location_displaycount_index
temporarily disable basic index scans by running:
The latter is less intrusive and only affects the current session. It leaves the methods
bitmap heap scan
andbitmap index scan
open, which are used by the faster plan.Then re-run the query.
BTW: If the theory is sound, your query (as you have it now) will be much faster with a less selective search term in the FTS condition - contrary to what you might expect. Try it.