I'd like to determine if a text input contains any keywords from a known list, but also show which ones are found.
E.g with Keywords: ['fox', 'jump', 'dog', 'foo', 'bar']
'the quick brown foxes jumped over the lazy dogs'
yields
['fox','jump','dog']
I appreciate that I can do boolean matching, but I can't work out if I can return the matched keywords, as opposed to a highlighted input snippet.
I'd like to use FTS so i can benefit from finding some synonyms etc.
Is this possible in Postgres?
Best Answer
You have two components of this problem
ts_query
,Now first and foremost, this is FTS. All words in the search string get tokenized as a
ts_query
. From that point on the original search string is lost. You no longer have words you have lexemes -- tokens which have been stemmed and normalized.Getting your array to a
ts_query
,Essentially, a
ts_query
looks like,'fox' & 'jump' & 'dog' & 'foo' & 'bar'
You should reviewplainto_tsquery
which does slightly more voodoo to gettext
to atsquery
, andto_tsquery
which assumes the string is already properly formatted.To get the array to a
ts_query
, you can do something like thisHighlighting
To get highlighting working you need something like this,
That returns
There are quite a few options on
ts_headline
. All of the docs for this are found in Controlling Text SearchGetting out the words
The only other thing you can do here is to pull out the words which triggered those tokens