Postgresql – Postgres Full Text Search With Short Word Matching

full-text-searchpostgresql

I'm using Postgres' full text search to return results, but it seems any query with 3 characters or fewer returns no results. Is there a way to configure Postgres so that I can query with 2 and 3 character long strings? My data set is extremely small, so performance would not be an issue.

For reference, here's my table definition:

CREATE TABLE doctokens (
    id serial primary key,
    inner_text TEXT NOT NULL,
    tokens tsvector
)
INSERT INTO doctokens (inner_text) VALUES ('Route builders give you a fluent syntax')

After an insert, I create the token tsvector:

UPDATE doctokens SET tokens = setweight(to_tsvector('english', COALESCE(inner_text, '')), 'A')

And here's how I'm querying the table:

SELECT ts_headline('english', inner_text, query, 'StartSel = <em>, StopSel = </em>, ShortWord = 1') as inner_text_highlights
FROM (SELECT inner_text, ts_rank_cd(tokens, query) AS rank, query
    FROM doctokens, plainto_tsquery('english', 'rou') AS query
    WHERE tokens @@ query
    ORDER BY rank DESC
    LIMIT 5) AS query_results;

Best Answer

'route' and 'routing' stem to 'rout'. Neither 'rou' nor 'routi' do. So, they don't match.

You can use the prefix match to make 'rou' match, but plainto_tsquery doesn't understand that so you have to make the tsquery yourself:

SELECT ts_headline('english', inner_text, query, 'StartSel = <em>, StopSel = </em>, ShortWord = 1') as inner_text_highlights
FROM (SELECT inner_text, ts_rank_cd(tokens, query) AS rank, query
    FROM doctokens, (select 'rou:*'::tsquery as query ) AS query
    WHERE tokens @@ query
    ORDER BY rank DESC
    LIMIT 5) AS query_results;
              inner_text_highlights               
--------------------------------------------------
 <em>Route</em> builders give you a fluent syntax
 Does <em>Routing</em> rhyme with looting?

Prefix matching won't let you match 'routi:*', because the things it would match against get the 'i' stripped off when stemming.