Postgresql – postgres: index to SELECT strings like ‘%foo%’;

indexindex-tuningperformancepostgresqlpostgresql-performance

I have an expensive, periodic PostgreSQL 9.3 query of the form:

SELECT * from mytable where name NOT LIKE '%foo%';

foo is actually a constant here that never changes. The query is expensive because it requires a table scan. I expect that only a few rows will actually match the query, so I want to use a partial index to speed up the query.

CREATE INDEX foo_idx ON mytable ((name NOT LIKE '%foo%')) WHERE name NOT LIKE '%foo%';

However, when I do an EXPLAIN on my query, it still is using a sequential scan instead of relying on the new index. What am I doing wrong?

I have tried several different expressions inside the parenthesized expression clause of the CREATE INDEX statement, but nothing helps:

CREATE INDEX foo_idx ON mytable (name) WHERE name NOT LIKE '%foo%';
CREATE INDEX foo_idx ON mytable ((name LIKE '%foo%')) WHERE name NOT LIKE '%foo%';

I have also tried removing the WHERE clause for the index, and that doesn't help either.

Best Answer

The index should work as is. As pointed out by @jjanes, you just needed to run ANALYZE. However, I suggest you modify the index. The expression in your definition is not useful.

CREATE INDEX foo_idx ON intentions ((data LIKE '%foo%')) WHERE data NOT LIKE '%foo%';

The expression is always FALSE and useless noise. Either simplify to the slightly cheaper equivalent index:

CREATE INDEX foo_idx ON intentions ((FALSE)) WHERE data NOT LIKE '%foo%';

Or, better yet, put something useful there. Since each index entry is going to allocate MAXALIGN (typically 8 bytes) for data anyway, and the overhead per index row is another 12 bytes (and the default fillfactor for btree indexes is 90%) you might as well put the index to good use and help other queries. If you have any other queries that might use the same partial index?

If you put something worth up to 8 bytes (like one or two int4 columns), the index retains exactly the same size and performance, but offers more possible use cases. If (one of) the column(s) is involved in updates, you have some additional maintenance cost. Still, if you have anything useful, use it. Any additional column that's used in a JOIN / WHERE or ORDER BY clause. Or column(s) that enable index-only scans:

CREATE INDEX foo_idx ON intentions (<useful_column(s)>)
WHERE data NOT LIKE '%foo%';

Related Solutions

Postgresql – Dropped an index and recreated it in PostgreSQL 9.2, not used anymore

I solved the mystery on pg irc channel - I had a long running query (idle in transaction) and they told me that a new index doesn't get used until all transactions that started before its creation don't finish.

Killed the transaction, and now the indexes are back to normal.

PostgreSQL – Why Partial Index is Not Used with Boolean in WHERE Clause

Partial indexes make sense when they match the query pattern. In your case if you would have queries with inactive IS NOT TRUE condition then the index would be used otherwise it's simply not useful.

Cannot use the partial index:

SELECT location FROM pitest WHERE name = 'Anne Lopez';

Can benefit from the index:

SELECT location FROM pitest WHERE name = 'Anne Lopez' AND inactive IS NOT TRUE;

Best Answer

Related Solutions

Postgresql – Dropped an index and recreated it in PostgreSQL 9.2, not used anymore

PostgreSQL – Why Partial Index is Not Used with Boolean in WHERE Clause

Related Question