PostgreSQL Index – Can Postgres Index Regular Expressions Stored in a Column?

indexpostgresqlpostgresql-9.4regular expression

I've got regular expressions stored as varchars in a column, which I need to match against incoming input. For example, the table might contain:

| field |     value     |
|-------|---------------|
| email | .*@domain.com |

And a query would be:

SELECT *
FROM table
WHERE field = 'email'
  AND 'someone@domain.com' ~* value

I'll be the first to admit this is pretty silly, though it did well enough for about 2 years. The table's now up to a mindshattering 10k rows, and queries slowed to on the order of 3 seconds. I've already moved us to a much more reasonable strategy, so this question is purely academic.

If I had kept this setup, is there any way to make the lookup efficient? I was hoping for some sibling of varchar_pattern_ops, but this query is the reverse of what that solves.


With the idea down, here's the full table, query, and explain.

+------------+-----------------------------+------------------------------------------------------------+
| Column     | Type                        | Modifiers                                                  |
|------------+-----------------------------+------------------------------------------------------------|
| id         | integer                     | not null default nextval('table_id_seq'::regclass)         |
| field      | character varying(255)      | not null                                                   |
| value      | character varying(1000)     | not null                                                   |
| comment    | text                        |                                                            |
+------------+-----------------------------+------------------------------------------------------------+
Indexes:
    "table_pkey" PRIMARY KEY, btree (id)
    "index_table_on_field_and_value" UNIQUE, btree (field, value)


EXPLAIN ANALYZE
SELECT *
FROM table
WHERE (
    (field = 'contact_email' AND 'person@place.com' ~* value)
 OR (field = 'phone'         AND value = '1234567890')
 OR (field = 'unique_id'     AND value = 'abcdef')
);

Seq Scan on table (cost=0.00..613.08 rows=58 width=1) (actual time=744.371..744.371 rows=0 loops=1)
Filter: ((((field)::text = 'contact_email'::text) AND ('person@place.com'::text ~* (value)::text))
      OR (((field)::text = 'phone'::text) AND ((value)::text = '01234567890'::text))              
      OR (((field)::text = 'unique_id'::text) AND ((value)::text = 'abcdef'::text)))             
Rows Removed by Filter: 11643
Total runtime: 744.395 ms

Best Answer

Short version: no. There's no practical way (in PostgreSQL, at least) to index a pattern column so it can be matched against plaintext inputs in a way that will speed up "does this plaintext match any of these patterns" queries.

PostgreSQL would need a special custom index type that "understood" pattern matches. I'm not sure how practical it'd be to implement one, nor how much could be gained from such an index. Since there isn't one, the question is somewhat academic.