I'm attempting to use JSONB
with JDBC, which means that I have to avoid any of the operators which use the '?' character (as the PostgreSQL JDBC driver has no escaping for this character). Taking a simple table:
CREATE TABLE jsonthings(d JSONB NOT NULL);
INSERT INTO jsonthings VALUES
('{"name":"First","tags":["foo"]}')
, ('{"name":"Second","tags":["foo","bar"]}')
, ('{"name":"Third","tags":["bar","baz"]}')
, ('{"name":"Fourth","tags":["baz"]}');
CREATE INDEX idx_jsonthings_name ON jsonthings USING GIN ((d->'name'));
Using the command line I can run a simple select and it uses the index as expected:
EXPLAIN ANALYZE SELECT d FROM jsonthings WHERE d->'name' ? 'First';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on jsonthings (cost=113.50..30236.13 rows=10000 width=61) (actual time=0.024..0.025 rows=1 loops=1)
Recheck Cond: ((d -> 'name'::text) ? 'First'::text)
Heap Blocks: exact=1
-> Bitmap Index Scan on idx_jsonthings_name (cost=0.00..111.00 rows=10000 width=0) (actual time=0.015..0.015 rows=1 loops=1)
Index Cond: ((d -> 'name'::text) ? 'First'::text)
Planning time: 0.073 ms
Execution time: 0.047 ms
(7 rows)
Since I cannot use the ?
character I resorted to using a function which underpins the ?
operator. However, it is not using the index:
EXPLAIN ANALYZE SELECT d FROM jsonthings WHERE jsonb_exists(d->'name','First');
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Seq Scan on jsonthings (cost=10000000000.00..10000263637.06 rows=3333334 width=61) (actual time=0.016..3135.119 rows=1 loops=1)
Filter: jsonb_exists((d -> 'name'::text), 'First'::text)
Rows Removed by Filter: 10000003
Planning time: 0.051 ms
Execution time: 3135.138 ms
(5 rows)
Why is this happening, and what can I do to make the function use the index? Note that in reality the table has another 10MM rows in it and I also have enable_seqscan
turned off so this isn't a case of the planner deciding not to use the index.
In response to a comment I tried using a custom operator instead:
CREATE OPERATOR ### (
PROCEDURE = jsonb_exists,
LEFTARG = jsonb,
RIGHTARG = text,
RESTRICT = contsel,
JOIN = contjoinsel);
But that has the same problem:
EXPLAIN ANALYZE SELECT d FROM jsonthings WHERE d->'name' ### 'First';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Seq Scan on jsonthings (cost=10000000000.00..10000263637.06 rows=10000 width=61) (actual time=0.012..3381.608 rows=1 loops=1)
Filter: ((d -> 'name'::text) ### 'First'::text)
Rows Removed by Filter: 10000003
Planning time: 0.046 ms
Execution time: 3381.623 ms
(5 rows)
Update
The latest PostgreSql driver (as of March 2015) has the ability to escape the ?
character so this specific case is no longer a problem.
Best Answer
Supported operators and workaround
The default operator class for GIN indexes on
json
columnsjsonb_ops
only supports these operators (per documentaion):You can achieve this the other way round: Create a simple
IMMUTABLE
SQL function using the?
operator, which can be inlined and will use the index just like the operator itself:This works, I tested in Postgres 9.4 ...
Misunderstandings
However, you have been asking the wrong question. There are two basic misconceptions in your question.
The
jsonb
operator?
cannot be used to search for values. Only for keys or array elements. The manual:You got the wrong operator, the
WHERE
condition cannot work:WHERE d->'name' ? 'First'
The expression index you have doesn't makes sense either way
The expression
d->'name'
returns ajsonb
value. You would needd
->>
'name'
to get the value astext
.But that would still be pointless. Since the value of the
name
key is a simple string, a GIN index (while possible) makes no sense to begin with.Solutions
You don't need the operator
?
- so no workaround either.Here are two ways that would actually work:
Plain GIN index on
d
and use the "contains" operator@>
:You could even use the more specialized operator class
jsonb_path_ops
. See:B-tree expression index on
d->>'email'
and test with good old=
:The second index would be considerably smaller and the query faster.