Indexing Subkey Inside Subkey in JSONB in PostgreSQL

indexindex-tuningjsonpostgresql

I have a jsonb field registration that contains an array of objects. I need to create two separates indexes to look for id, by example:

d61802ff-3eec-4a72-97ca-832f51b96bf0 = [ .tour_id.id ]
f492b26d-d260-4bcb-8e96-6532fdf38665 = [ .customer_id.id ]

Example data (I have more keys and arrays, but it's a resume)

[
   {
    "tour_id":
      {
        "id":"d61802ff-3eec-4a72-97ca-832f51b96bf0",
        "name":"Paris  2018"
      },
    "customer_id":
      {
        "id":"f492b26d-d260-4bcb-8e96-6532fdf38665",
        "name":"OBAMA John"
      }
   },
   {
    "tour_id":
      {
        "id":"a52d38d4-9bfb-4ffa-1122-e536f04b0c60",
        "name":"London  2018"
      },
    "customer_id":
      {
        "id":"f492b26d-d260-4bcb-8e96-6532fdf38665",
        "name":"OBAMA John"
      }
   }
]

How to index by [array].customer_id.id?
Or is there a method to index by subkey.subkey? Or deeper nested subkey?

I'm using Postgres 9.6.

Best Answer

First off, I would consider a normalized DB design instead of the convoluted jsonb column, which makes indexing and complex queries harder.

How to index by [array].customer_id.id?

See my answer to your previous question:

You could have two GIN indexes like instructed there, one on tour_id, another one on customer_id, or a single multicolumn index, and then query:

SELECT * FROM tbl
WHERE  f_extract_tour_ids(registration)     @> '{d61802ff-3eec-4a72-97ca-832f51b96bf0}'::uuid[]
AND    f_extract_customer_ids(registration) @> '{f492b26d-d260-4bcb-8e96-6532fdf38665}'::uuid[];

Note, this returns rows with tour_id and customer_id matching in any top-level object of the array, not necessarily in the same. If you want to find rows where tour_id and customer_id match in the same object, you have to do more ...

jsonb_path_ops index

However, with a growing number of different queries (requiring multiple special indexes), a single generic (much bigger) jsonb_path_ops index on the whole column becomes more competitive. Or even one with the default operator class operator class jsonb_ops - which is bigger, yet.

CREATE INDEX foo ON tbl USING GIN (registration jsonb_path_ops);

Queries to go with it:

For rows with tour_id and customer_id matching in the same top-level object:

SELECT * FROM tbl
WHERE  registration @> '[{"tour_id": {"id": "d61802ff-3eec-4a72-97ca-832f51b96bf0"}
                        , "customer_id": {"id": "f492b26d-d260-4bcb-8e96-6532fdf38665"}}]';

For rows with tour_id and customer_id matching in any top-level object, not necessarily the same:

SELECT * FROM tbl
WHERE  registration @> '[{"tour_id": {"id": "d61802ff-3eec-4a72-97ca-832f51b96bf0"}}
                       , {"customer_id":{"id":"f492b26d-d260-4bcb-8e96-6532fdf38665"}}]';

This solution is slower because of the much bigger index carrying all the other noise in your jsonb column. But it is much more versatile and simpler.
If the first solutions make your head ache, use this one.