Postgresql – Containment check tstzrange @> timestamptz not using btree or gist index

btreegist-indexindexpostgresqlpostgresql-11

Schema:

        Column        |           Type           
----------------------+--------------------------
 id                   | integer                  
 event_id             | integer                  
 started_at           | timestamp with time zone 
 ended_at             | timestamp with time zone 
 created_at           | timestamp with time zone 
    "event_seat_state_lookup_pkey" PRIMARY KEY, btree (id)
    "event_seating_lookup_created_at_idx" btree (created_at)
    "event_seating_lookup_created_at_idx2" gist (created_at)

Query:

SELECT id
FROM event_seating_lookup esl1
WHERE
  tstzrange(now() - interval '1 hour', now() + interval '1 hour', '[)') @> esl1.created_at;

Explain analyze:

Table with <100k rows.

 Seq Scan on event_seating_lookup esl1  (cost=0.00..1550.30 rows=148 width=4) (actual time=0.013..19.956 rows=29103 loops=1)
   Filter: (tstzrange((now() - '01:00:00'::interval), (now() + '01:00:00'::interval), '[)'::text) @> created_at)
   Buffers: shared hit=809
 Planning Time: 0.110 ms
 Execution Time: 21.942 ms

Table with 1M+ rows:

Seq Scan on event_seating_lookup esl1  (cost=10000000000.00..10000042152.75 rows=5832 width=4) (actual time=0.009..621.895 rows=1166413 loops=1)
  Filter: (tstzrange((now() - '01:00:00'::interval), (now() + '01:00:00'::interval), '[)'::text) @> created_at)
  Buffers: shared hit=12995
Planning Time: 0.092 ms
Execution Time: 697.927 ms

I have tried:

VACUUM FULL event_seating_lookup;
VACUUM event_seating_lookup;
VACUUM ANALYZE event_seating_lookup;
SET enable_seqscan = OFF;

Problem:

event_seating_lookup_created_at_idx or event_seating_lookup_created_at_idx2 indexes are not being used.

Notes:

  • PostgreSQL 11.1.
  • btree_gist extension is installed.
  • I have tried equivalent setup with created_at timestamp without time zone and using tsrange; same result.
  • I understand that rewriting the query with >=, < checks would make it use the btree index. The question is what is the reason the index is not used with the tstzrange containment operator and if there is a way to make it work.

Best Answer

The question is what is the reason the index is not used with the tstzrange containment operator and if there is a way to make it work.

The reason is quite trivial. B-tree indexes do no support the containment operator @>. Neither for range types like tstzrange nor for any other type (including array types).

The manual:

... a btree operator class must provide five comparison operators, <, <=, =, >= and >.

And GiST indexes do not support <, <=, > =, >= and >. See these chapters of the manual:

In Postgres indexes are bound to operators (which are implemented for certain types), not data types alone or functions or anything else. Related:

The GiST index event_seating_lookup_created_at_idx2 you have is pointless. It is created on the timestamptz column created_at. Such a GiST index would be useful for a range type (the opposite direction of logic).

Creating a GiST index on a timestamptz column is only possible since you installed the additional btree_gist extension to allow such useless indexes. (There are useful applications for multicolumn indexes or exclusion constraints ...) In stock Postgres you'd get an error:

ERROR: data type timestamp with time zone has no default operator class for access method "gist"

So while it's logically valid and technically possible to use a btree index (or a GiST index) for the query, the case is not implemented: no index support for timestamptz <@ tstzrange (where timestamptz would be the indexed expression!). It can be solved with <, <=, >, >= more efficiently. So I guess no developer felt (or will feel) the need to implement it.

Function to rewrite expression with favorable operators

Your implementation makes sense to me. It does what you want and due to function inlining makes use of a plain btree index on the timestamp column - event_seating_lookup_created_at_idx in your example. For calls with a constant range (like in a single function call) I suggest this modified version:

CREATE OR REPLACE FUNCTION in_range(timestamptz, tstzrange)
  RETURNS boolean AS
$func$
SELECT CASE WHEN lower_inc($2) THEN $1 >= lower($2) ELSE $1 > lower($2) END
   AND CASE WHEN upper_inc($2) THEN $1 <= upper($2) ELSE $1 < upper($2) END
$func$  LANGUAGE sql IMMUTABLE;

Declare it IMMUTABLE, because it actually is. Not to help function inlining (can even prevent it if the declaration is false) but for other gains. Related:

It can be inlined and uses the index just like your version. The difference: this one suppresses a redundant index condition for exclusive bounds. Fortunately, your consideration in this regard is slightly off target:

The reason for first matching on the $1 >= lower($2) AND $1 <= upper($2) condition and then checking the upper_inc and lower_inc constraints is to benefit from range scan first and then filter the result.

Postgres 11 (at least) is even smarter than that. I see no Filter step for your version. For default [) bounds (like in your example), I get this query plan (line breaks in conditions added by me):

  ->  Index Only Scan using foo_idx on foo (actual rows=5206 loops=1)
        Index Cond: ((datetime >= '2018-09-05 22:00:00+00'::timestamp with time zone)
                 AND (datetime <= '2018-09-05 22:30:00+00'::timestamp with time zone)
                 AND (datetime < '2018-09-05 22:30:00+00'::timestamp with time zone))

An actual Filter step could add a more substantial cost for corner cases with many hits on an excluded bound. Those would be fetched from the index and then discarded. Quite relevant for time ranges where values often end up on bounds - like timestamps on the full hour.

The actual difference is see is minor, but why not take it?

  ->  Index Only Scan using foo_idx on foo (actual rows=5206 loops=1)
        Index Cond: ((datetime >= '2018-09-05 22:00:00+00'::timestamp with time zone)
                 AND (datetime < '2018-09-05 22:30:00+00'::timestamp with time zone))