Creating PostgreSQL Operator Class for Custom Operator Index Support

indexoperatorpostgresql

I'm trying to create an index that will support queries that use my custom operator. This is on PostgreSQL 10.4.

The custom operator

I followed the tips in this SO answer to create an operator that performs "LIKE" style matching on elements in an text ARRAY.

CREATE FUNCTION reverse_like (text, text) returns boolean language sql 
as $$ select $2 like $1 $$;
    
CREATE OPERATOR <~~ ( function =reverse_like, leftarg = text, rightarg=text );

The above operator allows me to do things like

SELECT 'ab%' <~~ ANY('{"abc","def"}');

The schema, index and query

I have a table with web traffic visits called sessions which includes an array column.

CREATE TABLE sessions
(
   session_id    varchar(24) NOT NULL,
   first_seen    timestamp,
   domains       varchar[]
);

To query the domains column to see if a given domain (or partial/ wildcarded domain name) was visited I can do the following:

SELECT count(*)
FROM session_4070ba14_f081_41cb_9ef7_9dd385934da7
WHERE 'www.foo%' <~~ ANY(domains);

I want to speed up the above queries with GIN index. So I created the index as follows:

CREATE INDEX idx_domains ON session USING GIN(domains);

The Question

After running analyze on the table and a set enable_seqscan = false; I have no luck getting Postgres to employ this index. It's always doing a seqscan. It uses the above index of array operators like @> but not for my custom <~~ operator.

I think its because the GIN index doesn't know how to handle my custom operator – so do I need to create an operator class and then create my index using that? Or do I create a functional index?

Best Answer

For trigram support, you can try the parray_gin extension

WHERE domains @@> ARRAY['www.foo%'];

If you just want to do prefix matching (more efficiently than that provided by trigram), I don't think there is any way you can do that without writing some C code to glue the pieces together. I think you would then work on the array type directly, so wouldn't need the ANY, and so wouldn't benefit from the reverse_like operator at all.