Postgresql – Fuzzy Matching with Postgresql 9.3

full-text-searchpostgresql

Right now I am using ts_rank (with ts_vector and ts_query) to search strings for relevance. While this is doing a great job for me so far. I was hoping to work in a bit of fuzzieness into the searching.

As it stands the search is done by a user entering their terms as a string, that I then parse into tokens. These tokens are then compared against a string. What I would like to do is use something similar to a levenshtein in these queries to allow for minor misspellings to be matched as well.

For example if the user typed in rubico instead of rubicon I would still get a match on it (because rubico only has a distance of 1 from rubicon).

Is this possible to do with Postgres' ts_rank functionality, or are there other options to allow for text searches to work with levenstheins?

Best Answer

Instead of matching the portion of the string that I wanted to fuzzy search on I instead, concatinated all of the terms into a string, and then ran the difference function (from the fuzzystrmatch module) against it. So it looked something like this.

FOR term IN terms LOOP
    term_string := term_string||' '||term;
END LOOP;

score := score + difference(term_string,string_to_compare_against);

I did need to set a threshold that needed to be passed by the score to keep from getting to many false positives.