PostgreSQL – Understanding Postgres query planner behaviour on GIN index

index | postgresql | postgresql-performance

Need your expert opinion on index usage and query planner behaviour.

\d orders
                                         Partitioned table "public.orders"
           Column            |           Type           | Collation | Nullable |                   Default
-----------------------------+--------------------------+-----------+----------+----------------------------------------------
 oid                         | character varying        |           | not null |
 user_id                     | character varying        |           | not null |
 tags                        | text[]                   |           | not null |
 category                    | character varying        |           |          |
 description                 | character varying        |           |          |
 order_timestamp             | timestamp with time zone |           | not null |
 ...
Partition key: RANGE (order_timestamp)
Indexes:
    "orders_uid_country_ot_idx" btree (user_id, country, order_timestamp)
    "orders_uid_country_cat_ot_idx" btree (user_id, country, category, order_timestamp desc)
    "orders_uid_country_cat_tag_gin_idx" gin (user_id, country, category, tags) WITH (fastupdate=off)
    "orders_uid_oid_ot_key" UNIQUE CONSTRAINT, btree (user_id, oid, order_timestamp)

The table contains millions of rows.
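
For context, the mixed-column GIN index shown above relies on the btree_gin extension (or an equivalent) to provide GIN operator classes for the scalar columns; it was created roughly along these lines (a sketch, not the exact DDL):

create extension if not exists btree_gin;

create index orders_uid_country_cat_tag_gin_idx
    on orders using gin (user_id, country, category, tags)
    with (fastupdate = off);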

I have observed the following behaviour, depending on the query parameters, when I run this query:

select * from orders
where user_id = 'u1'
and country = 'c1'
and tags && '{t1}'
and order_timestamp >= '2021-01-01 00:00:00+00'
and order_timestamp <  '2021-03-25 05:45:47+00'
order by order_timestamp desc
limit 10 offset 0;

case 1

For records with tag t1, where t1 occupies about 99% of the records for user u1, the first index, orders_uid_country_ot_idx, is picked:

Limit  (cost=0.70..88.97 rows=21 width=712) (actual time=1.967..12.608 rows=21 loops=1)
   ->  Index Scan Backward using orders_y2021_jan_to_uid_country_ot_idx on orders_y2021_jan_to_jun orders  (cost=0.70..1232.35 rows=293 width=712) (actual time=1.966..12.604 rows=21 loops=1)
         Index Cond: (((user_id)::text = 'u1'::text) AND ((country)::text = 'c1'::text) AND (order_timestamp >= '2021-01-01 00:00:00+00'::timestamp with time zone) AND (order_timestamp < '2021-03-25 05:45:47+00'::timestamp with time zone))
         Filter: (tags && '{t1}'::text[])
 Planning Time: 0.194 ms
 Execution Time: 12.628 ms

case 2

But when I query for tag t2 with tags && '{t2}', and t2 is present in anywhere from 0 to 3% of the rows for a user, the GIN index is picked:

Limit  (cost=108.36..108.38 rows=7 width=712) (actual time=37.822..37.824 rows=0 loops=1)
   ->  Sort  (cost=108.36..108.38 rows=7 width=712) (actual time=37.820..37.821 rows=0 loops=1)
         Sort Key: orders.order_timestamp DESC
         Sort Method: quicksort  Memory: 25kB
         ->  Bitmap Heap Scan on orders_y2021_jan_to_jun orders  (cost=76.10..108.26 rows=7 width=712) (actual time=37.815..37.816 rows=0 loops=1)
               Recheck Cond: (((user_id)::text = 'u1'::text) AND ((country)::text = 'ID'::text) AND (tags && '{t2}'::text[]))
               Filter: ((order_timestamp >= '2021-01-01 00:00:00+00'::timestamp with time zone) AND (order_timestamp < '2021-03-25 05:45:47+00'::timestamp with time zone))
               ->  Bitmap Index Scan on orders_y2021_jan_to_uid_country_cat_tag_gin_idx  (cost=0.00..76.10 rows=8 width=0) (actual time=37.812..37.812 rows=0 loops=1)
                     Index Cond: (((user_id)::text = 'u1'::text) AND ((country)::text = 'c1'::text) AND (tags && '{t2}'::text[]))
 Planning Time: 0.190 ms
 Execution Time: 37.935 ms

  1. Is this because the query planner sees that 99% of the rows qualify in case 1, so it skips the GIN index and uses the first index directly? If so, does Postgres determine this from the stats?

  2. Before the GIN index was created, the first index was also picked for case 2, and performance was very bad because the index access range is large, i.e. the number of rows satisfying the user_id, country and timestamp conditions is very high. The GIN index improved it, but I'm curious to understand how Postgres chooses it so selectively.

Below is the EXPLAIN (ANALYZE, BUFFERS) result with bitmap heap scans explicitly disabled:

Limit  (cost=0.70..408.53 rows=7 width=712) (actual time=496.142..496.143 rows=0 loops=1)
   Buffers: shared hit=29 read=727
   ->  Index Scan Backward using orders_y2021_jan_to_user_id_country_ot_idx on orders_y2021_jan_to_jun  (cost=0.70..408.53 rows=7 width=712) (actual time=496.140..496.140 rows=0 loops=1)
         Index Cond: (((user_id)::text = 'u1'::text) AND ((country)::text = 'c1'::text) AND (order_timestamp >= '2021-01-01 00:00:00+00'::timestamp with time zone) AND (order_timestamp < '2021-03-25 05:45:47+00'::timestamp with time zone))
         Filter: (tags && '{t2}'::text[])
         Rows Removed by Filter: 752
         Buffers: shared hit=29 read=727
 Planning Time: 0.195 ms
 Execution Time: 496.167 ms

  3. orders_uid_country_cat_ot_idx was added to support filtering by category: when the GIN index was used for a filter on just category, or on both category and tags, performance was worse than when the btree index on (user_id, country, category, order_timestamp) was picked. I expected the GIN index to work well for every combination of category and tags filters. What could be the reason?

Best Answer

It is not so common that we get questions about why the planner does such a good job!

By looking at the expected rows values given in the Index Scan Backward and in the Bitmap Heap Scan (or equivalently the Sort) nodes, we see that for one it expects to find 293 rows, and for the other it expects 7. Yes, this difference does come from the stats. Specifically, since tags is an array column, its contribution comes from the values in select most_common_elems, most_common_elem_freqs from pg_stats where tablename='orders' and attname='tags'. The selectivity derived from that then gets multiplied by the selectivities of the 3 other columns to get the overall selectivity, which is multiplied by the estimated number of rows in the table to get the estimated row count.
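
A minimal sketch of that lookup (depending on whether the parent table or the individual partition was analyzed, the relevant row may be the one for orders or for orders_y2021_jan_to_jun):

select tablename, most_common_elems, most_common_elem_freqs
from pg_stats
where tablename in ('orders', 'orders_y2021_jan_to_jun')
  and attname = 'tags';

If 't1' shows up there with a frequency near 0.99 while 't2' is rare or absent, the 293-row versus 7-row estimates above fall out of multiplying those element frequencies with the selectivities of the other predicates.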

For your question 2, to get a definitive answer you should show us the EXPLAIN (ANALYZE, BUFFERS) for the query when the GIN index is not there. Scanning the full complement of 293 rows it expects shouldn't take very long, so maybe that part of the estimate is way off and there are really a lot more rows. Since dropping an index can be a hassle, you could instead set enable_bitmapscan = off in order to force it not to use the GIN index. An important part of the expected performance of the 1st plan is the prospect of stopping early. It gets to stop after reading only 20 of the (expected to be) 293 rows. But if there will only be 7, it will not get to stop early. It will have to read all 7 of those, plus however many it has to read and then filter out by the tags criterion before it finds those 7.
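
A concrete sketch of that experiment; the SET only affects the current session and can be undone with RESET:

set enable_bitmapscan = off;

explain (analyze, buffers)
select * from orders
where user_id = 'u1'
  and country = 'c1'
  and tags && '{t2}'
  and order_timestamp >= '2021-01-01 00:00:00+00'
  and order_timestamp <  '2021-03-25 05:45:47+00'
order by order_timestamp desc
limit 10 offset 0;

reset enable_bitmapscan;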

Updating to reflect the new plan: that performance wasn't as bad as I was assuming based on your description of 'very bad' (but yes, it is still substantially worse than when using the GIN index). We don't know how many rows it was expecting to find in the index and then filter out using tags, as EXPLAIN does not give that level of detail. But I assume it was expecting around 293, which is not all that far away from 752 as far as complex planning goes. If you ran EXPLAIN on the same query but without the LIMIT or the tags filter, it would clarify what it was really expecting, although I doubt it would add much value here.

But reading and filtering out those 752 rows did require reading 727 pages, and that IO is presumably where most of your time goes (turning track_io_timing on would verify that). If the table were better clustered on the index orders_y2021_jan_to_uid_country_ot_idx, then reading those 752 rows would be much faster, and you would probably no longer benefit from the GIN index. But this observation is probably not of much value, as the GIN index works well and clustering tables is annoying. It looks like the PostgreSQL planner did at least partially understand the slowness from all the random IO, and that is why it chose the other plan, which turns out to be a good choice.
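
To verify where the time goes, you can turn on track_io_timing for your session, assuming your role is allowed to change it (it is superuser-settable); EXPLAIN (ANALYZE, BUFFERS) then adds per-node I/O timing lines:

set track_io_timing = on;
-- re-run the EXPLAIN (ANALYZE, BUFFERS) from above and look for
-- the "I/O Timings" line under the index scan node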

One thing I would note is that it would probably be better to have the GIN index just on one column, "tags". Then it would probably combine the GIN index and the orders_y2021_jan_to_uid_country_ot_idx index with a BitmapAnd, which should be more efficient than just using the GIN index. (The planner should probably make that choice anyway even with the existing indexes--I don't know why it doesn't, that seems to be some weakness in the cost estimation process)
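
A sketch of that single-column alternative (the index name is illustrative; fastupdate=off is kept only to match the existing index):

create index orders_tags_gin_idx on orders using gin (tags) with (fastupdate = off);

With it in place, the planner could combine a Bitmap Index Scan on this index with one on orders_uid_country_ot_idx via a BitmapAnd before visiting the heap.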

For your question 3, you don't show us any GIN indexes containing category and tags. If you don't have one, it is not surprising it won't be used. Anyway, it seems like a fundamentally different question, you should ask it as a separate question, and show the EXPLAIN (ANALYZE, BUFFERS) for the query that underperformed your expectations.