Postgresql – the alternative of the combination of two pattern matching conditions

postgresqlpostgresql-11

So what i want to do is to search in my table data for elements that in their name exists the word first and the word second. These two words can be seperated or attached to each others. A translation of that would be the following :

SELECT name FROM data WHERE name ILIKE '%first%' AND name ILIKE '%second%'

I searched a bit for an alternative for this query and how i can combine the condition in a regular expression but no luck.

My concern is performance problems if this query were to be performed over a large table. I already created the following index (which sped up considerably the search) :

CREATE INDEX pattern_matching_index ON data USING GIN (name gin_trgm_ops); 

Is there a better way to perform this kind of search ? or is this as good as it's going to get ?

Best Answer

If first and second can be any value and could be embedded in other words without any kind of delimiter (eg, foofirstbar), then yes, using trigrams is probably as good as you're going to get. If there are a limited number of values for first and second, you could create an expression index of the name column passed through a regexp_replace to add spaces around your target words, and then use the built-in full-text-search functionality:

testdb=# create table data(t text);
CREATE TABLE
testdb=# insert into data select 'foofirstbarsecondbaz';
INSERT 0 1
testdb=# insert into data select 'foobarsecondbaz';
INSERT 0 1
testdb=# insert into data select 'foosecondbarfirstbaz';
INSERT 0 1
testdb=# select t from data where 'first & second'::tsquery @@ regexp_replace(t, '(first|second)', ' \1 ', 'g')::tsvector;
          t           
----------------------
 foofirstbarsecondbaz
 foosecondbarfirstbaz
(2 rows)