Postgresql – Index for a complex similarity query Postgres

postgresqlpostgresql-9.6postgresql-performance

I have a materialized view that has a few columns that I need to run a complex OR query on with similarity matching and the structure is like this

mymatview:

column a,
column b,
column c,
column d,
column e

and on column D (which is a TEXT field) I run a query like this

SELECT * FROM mymatview   
WHERE ((LOWER(d) = <<some value>> ) OR (LENGTH(d) > 4 AND d % <<some value>>))  
    OR ((LOWER(d) = <<some other value>> ) OR (LENGTH(d) > 4 AND d % <<some other value>>))

Currently the matview has 1 mil+ rows in it and it runs a sequence scan on that table using the filter above with the OR conditionals. Does anyone know what a good index would be on column d that would help with performance on running a query like this?

UPDATE: I am using the package pg_trgm for the '%' similarity operator here. Postgres 9.6 https://www.postgresql.org/docs/9.6/pgtrgm.html

Best Answer

Your only chance is to index all possible conditions, perhaps PostgreSQL will use bitmap operations to cover the ORs. Perhaps it will think that a sequential scan is cheaper and go for that.

CREATE INDEX ON mymatview (lower(d));

CREATE INDEX ON mymatview USING gin (d gin_trgm_ops) WHERE length(d) > 4;