Postgresql – query behaviour of composite gin index

indexpostgresqlpostgresql-9.6

CREATE   TABLE products(  
    products_id serial,
    location_id integer, --(not unique)
    idle_products jsonb --has 1000 elements ('[13,45,8976.....]'::jsonb)
);

CREATE EXTENSION btree_gin;
CREATE INDEX idx ON  products USING GIN(location_id, idle_products);

Take just for example;

products table have millions of rows and 1000 elements in every row in idle_products column.

SELECT * FROM products
WHERE   location_id IN (SELECT * FROM unnest(array[1245,456,6342,2,23453]::int[])) AND
        idle_products @> to_jsonb(222) ;

On that query what i want to learn is:

Is idle_productsbeing searched through only these 5 rows by composite gin index and if so, does that mean much more performance gain(because there are only 5×1000 elements in 5 rows in idle_products column,
i mean it would only search through 5 rows using gin composite index) or is idle_products being searched through all rows in products table (despite filtering against location_id) and then reduce(limit) rows to 5
(array[1245,456,6342,2,23453]::int[] has 5 five location_id) by filtering location_id values?

Best Answer

If you want to see how the query will be run, you should put "EXPLAIN" before the "SELECT" and it will give you the execution plan.

There are three ways it could use the gin index:

  1. It could look up just the location_id that you specify, then fetch those rows and filter out ones not containing 222.
  2. It could look up just the idle_products you specify, then fetch those rows and filter out ones not containing one of your location_id list.
  3. It could look up all the index entries for 222, and all the index entries for your location_id list, and take the intersection of those to go fetch from the table.

It will choose which of those it thinks will work better, based on what it knows about your data distribution.

Also, I don't find composite gin indexes to be useful. Having separate indexes is more flexible, and PostgreSQL can still combine them (using BitmapAnd) when that is useful to do. I'd replace the composite index with a ordinary btree index on location_id and a gin index on the array column.