PostgreSQL operator uses index but underlying function does not

index-tuningjsonoperatorpostgresqlpostgresql-9.4

I'm attempting to use JSONB with JDBC, which means that I have to avoid any of the operators which use the '?' character (as the PostgreSQL JDBC driver has no escaping for this character). Taking a simple table:

CREATE TABLE jsonthings(d JSONB NOT NULL);
INSERT INTO jsonthings VALUES
    ('{"name":"First","tags":["foo"]}')
  , ('{"name":"Second","tags":["foo","bar"]}')
  , ('{"name":"Third","tags":["bar","baz"]}')
  , ('{"name":"Fourth","tags":["baz"]}');
CREATE INDEX idx_jsonthings_name ON jsonthings USING GIN ((d->'name'));

Using the command line I can run a simple select and it uses the index as expected:

EXPLAIN ANALYZE SELECT d FROM jsonthings WHERE d->'name' ? 'First';

                                                            QUERY PLAN                                                            
----------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on jsonthings  (cost=113.50..30236.13 rows=10000 width=61) (actual time=0.024..0.025 rows=1 loops=1)
   Recheck Cond: ((d -> 'name'::text) ? 'First'::text)
   Heap Blocks: exact=1
   ->  Bitmap Index Scan on idx_jsonthings_name  (cost=0.00..111.00 rows=10000 width=0) (actual time=0.015..0.015 rows=1 loops=1)
         Index Cond: ((d -> 'name'::text) ? 'First'::text)
 Planning time: 0.073 ms
 Execution time: 0.047 ms
(7 rows)

Since I cannot use the ? character I resorted to using a function which underpins the ? operator. However, it is not using the index:

EXPLAIN ANALYZE SELECT d FROM jsonthings WHERE jsonb_exists(d->'name','First');
                                                            QUERY PLAN                                                            
----------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on jsonthings  (cost=10000000000.00..10000263637.06 rows=3333334 width=61) (actual time=0.016..3135.119 rows=1 loops=1)
   Filter: jsonb_exists((d -> 'name'::text), 'First'::text)
   Rows Removed by Filter: 10000003
 Planning time: 0.051 ms
 Execution time: 3135.138 ms
(5 rows)

Why is this happening, and what can I do to make the function use the index? Note that in reality the table has another 10MM rows in it and I also have enable_seqscan turned off so this isn't a case of the planner deciding not to use the index.

In response to a comment I tried using a custom operator instead:

CREATE OPERATOR ### (
  PROCEDURE = jsonb_exists,
  LEFTARG = jsonb,
  RIGHTARG = text,
  RESTRICT = contsel,
  JOIN = contjoinsel);

But that has the same problem:

EXPLAIN ANALYZE SELECT d FROM jsonthings WHERE d->'name' ### 'First';
                                                           QUERY PLAN                                                           
--------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on jsonthings  (cost=10000000000.00..10000263637.06 rows=10000 width=61) (actual time=0.012..3381.608 rows=1 loops=1)
   Filter: ((d -> 'name'::text) ### 'First'::text)
   Rows Removed by Filter: 10000003
 Planning time: 0.046 ms
 Execution time: 3381.623 ms
(5 rows)

Update

The latest PostgreSql driver (as of March 2015) has the ability to escape the ? character so this specific case is no longer a problem.

Best Answer

Supported operators and workaround

The default operator class for GIN indexes on json columns jsonb_ops only supports these operators (per documentaion):

Name        Indexed Data Type   Indexable Operators
...
jsonb_ops   jsonb               ? ?& ?| @>

You can achieve this the other way round: Create a simple IMMUTABLE SQL function using the ? operator, which can be inlined and will use the index just like the operator itself:

CREATE OR REPLACE FUNCTION jb_contains(jsonb, text)
  RETURNS bool AS
'SELECT $1 ? $2' LANGUAGE sql IMMUTABLE;

This works, I tested in Postgres 9.4 ...

Misunderstandings

However, you have been asking the wrong question. There are two basic misconceptions in your question.

  1. The jsonb operator ? cannot be used to search for values. Only for keys or array elements. The manual:

    Description:
    Does the string exist as a top-level key within the JSON value?

    You got the wrong operator, the WHERE condition cannot work: WHERE d->'name' ? 'First'

  2. The expression index you have doesn't makes sense either way

    CREATE INDEX idx_jsonthings_name ON jsonthings USING GIN ((d->'name'));
    • The expression d->'name' returns a jsonb value. You would need d->>'name' to get the value as text.

    • But that would still be pointless. Since the value of the name key is a simple string, a GIN index (while possible) makes no sense to begin with.

Solutions

You don't need the operator ? - so no workaround either.
Here are two ways that would actually work:

  1. Plain GIN index on d and use the "contains" operator @>:

    CREATE INDEX idx_jsonthings_d_gin ON jsonthings USING GIN (d);
    
    SELECT d FROM jsonthings WHERE d @> '{"name":"First"}'
    

    You could even use the more specialized operator class jsonb_path_ops. See:

  2. B-tree expression index on d->>'email' and test with good old =:

    CREATE INDEX idx_jsonthings_d_email ON jsonthings ((d->>'email'));
    
    SELECT d FROM jsonthings WHERE d->>'email' = 'First';
    

The second index would be considerably smaller and the query faster.