I would like to get results with this query:
SELECT *
FROM  (
   SELECT id, subject
   FROM   mailboxes
   WHERE  tsv @@ plainto_tsquery('avail')
   ) AS t1
ORDER  BY id DESC;
This works and returns rows whose tsv contains "Available". But if I use "avai" (dropping "lable"), it cannot find anything.
Do all queries have to be in the dictionary? Can't we just query for such letters? I have a database that contains e-mail bodies (content), and I would like to make the search fast, as the table grows every second. Currently I am using:
... WHERE content ~* 'letters'
Best Answer
No. Because only word stems (according to the used text search configuration) are in the index to begin with. But more importantly:
No. Because, on top of that, full text search is also capable of prefix matching:
This would work:
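A minimal sketch of such a query, assuming the mailboxes table and the tsv column from the question, and built from the three points noted in this answer (to_tsquery(), the 'simple' configuration, and the :* suffix):

```sql
-- Sketch: prefix match against the tsv column from the question.
-- 'simple' takes 'avail' as is (no stemming); ':*' marks it as a prefix.
SELECT id, subject
FROM   mailboxes
WHERE  tsv @@ to_tsquery('simple', 'avail:*')
ORDER  BY id DESC;
```

The subquery wrapper from the question is not needed for the ORDER BY, so it is left out here.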
Note 3 things:
1. Use to_tsquery(), not plainto_tsquery(), in this case because, per the manual, plainto_tsquery() will not recognize tsquery operators, weight labels, or prefix-match labels in its input.
2. Use the 'simple' text search configuration to generate the tsquery, since you obviously want to take the word 'avail' as is and not apply stemming.
3. Append :* to make it a prefix search, i.e. find all lexemes starting with 'avail'.
Important: This is a prefix search on lexemes (word stems) in the document. A regular expression match without wildcards (content ~* 'avail') is not exactly the same! The latter is not left-anchored (to the start of lexemes) and would also find 'FOOavail' etc.
It's unclear whether you want the behavior outlined in your query or the equivalent of the added regular expression. Trigram indexes (pg_trgm), like @Evan already suggested, are the right tool for the latter. There are many related questions on dba.SE, try a search.
Overview:
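For the unanchored regular expression case, a trigram index could look roughly like this. It is only a sketch: it assumes the content column from the question, that the pg_trgm extension can be installed, and a PostgreSQL version where trigram indexes support regular expression matches (9.3 or later). The index name is hypothetical.

```sql
-- Assumption: pg_trgm is available on the server.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical index name; gin_trgm_ops is the GIN operator class from pg_trgm.
CREATE INDEX mailboxes_content_trgm_idx
ON     mailboxes USING gin (content gin_trgm_ops);

-- The regular expression match from the question can then use the index.
SELECT id, subject
FROM   mailboxes
WHERE  content ~* 'avail'
ORDER  BY id DESC;
```

Trigram indexes tend to be considerably bigger than a GIN index on a tsvector, so it is worth deciding first which of the two behaviors is actually needed.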
Demo
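A small self-contained sketch to compare the two predicates side by side. The sample strings are made up for illustration, and the 'english' configuration for the document side is an assumption about how tsv was built:

```sql
-- Hypothetical sample rows; no table needed.
SELECT doc
     , to_tsvector('english', doc) @@ to_tsquery('simple', 'avail:*') AS prefix_match
     , doc ~* 'avail'                                                  AS regex_match
FROM  (
   VALUES
     ('Available now')
   , ('FOOavail is one word')
   , ('No match here')
   ) AS t(doc);
```

The second row is where the two columns should differ: the regular expression is not anchored to the start of a lexeme, while the prefix search is.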
Related answer (see chapter "Different approach to optimize search"):
Emails?
Since you mentioned emails, be aware that the text search parser identifies emails and does not split them into separate words / lexemes. Consider:
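One way to inspect this is ts_debug(), which reports the token type the parser assigns to each piece of the input. The address below is a made-up example:

```sql
-- Hypothetical address; the parser is expected to report it as one token
-- of type 'email' rather than as separate words.
SELECT alias, token
FROM   ts_debug('simple', 'john.doe@example.com');
```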
I would replace the separators @ and . in your emails with a space (' ') to index the contained words.
Also, since you are dealing with names in emails, not with English (or some other language) words, I would use the 'simple' text search configuration to disable stemming and other language features.
Build the tsvector column with:
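A sketch of that expression, using translate() to swap both separators for spaces before the text reaches to_tsvector(). The address is a made-up example, and the same expression would be applied to whichever column actually holds the address:

```sql
-- translate() maps each character of '@.' to the character at the same
-- position in the third argument, here two spaces.
SELECT to_tsvector('simple', translate('john.doe@example.com', '@.', '  ')) AS tsv;
```

With the separators gone, each part of the address becomes its own lexeme, so a prefix search such as to_tsquery('simple', 'joh:*') has something to match.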