Postgresql – Functional Index LOWER PostgreSQL not working

case sensitiveindexpostgresql

The below SQL query runs in 2s:

SELECT
    *
FROM
    fb_name
    INNER JOIN fb_kg ON mid = subject_mid
    WHERE
        LOWER(alias) = LOWER('Polaski')
        AND fb_kg.relation = 'location/location/containedby'

The below SQL query runs in 2ms:

SELECT
    *
FROM
    fb_name
    INNER JOIN fb_kg ON mid = subject_mid
    WHERE
        alias = 'Polaski'
        AND fb_kg.relation = 'location/location/containedby'

The difference between the two queries is the inclusion of the LOWER function.

There has been an index created for both queries:

CREATE INDEX fb_name_lower_alias ON fb_name (lower(alias));
CREATE INDEX fb_name_mid_lower_alias_index ON fb_name (mid, lower(alias));
CREATE INDEX fb_name_alias ON fb_name (alias);
CREATE INDEX fb_name_mid_alias_index ON fb_name (mid, alias);

EXPLAIN for LOWER:

"Gather  (cost=247515.73..249133.22 rows=5708 width=70)"
"  Workers Planned: 2"
"  ->  Merge Join  (cost=246515.73..247562.42 rows=2378 width=70)"
"        Merge Cond: ((fb_kg.subject_mid)::text = (fb_name.mid)::text)"
"        ->  Sort  (cost=205352.30..205794.73 rows=176972 width=44)"
"              Sort Key: fb_kg.subject_mid"
"              ->  Parallel Bitmap Heap Scan on fb_kg  (cost=14116.24..184480.39 rows=176972 width=44)"
"                    Recheck Cond: ((relation)::text = 'location/location/containedby'::text)"
"                    ->  Bitmap Index Scan on relation_index  (cost=0.00..14010.06 rows=424733 width=0)"
"                          Index Cond: ((relation)::text = 'location/location/containedby'::text)"
"        ->  Sort  (cost=41163.43..41232.45 rows=27611 width=26)"
"              Sort Key: fb_name.mid"
"              ->  Bitmap Heap Scan on fb_name  (cost=750.54..39126.71 rows=27611 width=26)"
"                    Recheck Cond: (lower((alias)::text) = 'polaski'::text)"
"                    ->  Bitmap Index Scan on fb_name_lower_alias_varchar_pattern_ops  (cost=0.00..743.64 rows=27611 width=0)"
"                          Index Cond: (lower((alias)::text) = 'polaski'::text)"

Other EXPLAIN:

"Nested Loop  (cost=11.45..13502.52 rows=2 width=70)"
"  ->  Bitmap Heap Scan on fb_name  (cost=4.64..48.23 rows=11 width=26)"
"        Recheck Cond: ((alias)::text = 'Polaski'::text)"
"        ->  Bitmap Index Scan on fb_name_alias_index  (cost=0.00..4.64 rows=11 width=0)"
"              Index Cond: ((alias)::text = 'Polaski'::text)"
"  ->  Bitmap Heap Scan on fb_kg  (cost=6.81..1223.04 rows=8 width=44)"
"        Recheck Cond: ((subject_mid)::text = (fb_name.mid)::text)"
"        Filter: ((relation)::text = 'location/location/containedby'::text)"
"        ->  Bitmap Index Scan on subject_mid_index  (cost=0.00..6.81 rows=316 width=0)"
"              Index Cond: ((subject_mid)::text = (fb_name.mid)::text)"

CREATE TABLE definitions

CREATE TABLE fb_kg
        (object_mid varchar NOT NULL,
        relation varchar NOT NULL,
        subject_mid varchar NOT NULL,
        PRIMARY KEY(object_mid, relation, subject_mid));

CREATE TABLE fb_name
    (mid varchar NOT NULL,
    alias varchar NOT NULL,
    PRIMARY KEY(mid, alias));

How come with the same indexes the LOWER function is about 60x slower?

Best Answer

Using citext

My suggestion is to move to the citext module.

CREATE EXTENSION citext;
BEGIN;
  --out with the old
  DROP INDEX fb_name_lower_alias;
  DROP INDEX fb_name_mid_lower_alias_index;
  DROP INDEX fb_name_alias;
  DROP INDEX fb_name_mid_alias_index;

  --in with the new
  ALTER TABLE fb_name
    ALTER COLUMN alias
    SET DATA TYPE citext;

  CREATE INDEX ON fb_name(mid,alias);
  CREATE INDEX ON fb_name(alias);
  ANALYZE fb_name;
COMMIT;

Then simply stop using lower() everywhere. Your query should like this,

SELECT *
FROM fb_name
INNER JOIN fb_kg ON mid = subject_mid
WHERE alias = 'Polaski'
  AND fb_kg.relation = 'location/location/containedby';

Everything else should work the same. Paste the EXPLAIN ANALYZE of the new query.

Update on indexes, and case-sensitive searching on citext

If you need a case-sensitive query on a citext column, you just have to cast the other side to ::text, and in fact using the citext index is usually good enough. For example you can do a case-sensitive index only scan with WHERE col::citext = 'fOo'::citext AND col::text = 'fOo'; or, you could create two indexes on the citext column:

  • one that's case sensitive ON table(myCITEXT)
  • one that's case insensitive (I find that wasteful though) ON table(myCITEXT::text)