Postgresql – Postgres condition pushdown when using range conditions (>=, <= or BETWEEN)

explainoptimizationpostgresql

I want to understand better the circumstances under which Postgres query executioner pushes conditions down
into the join components. I would be especially grateful for references to source code parts responsible for this.

Let's say that you have a view the body of which is a join between two tables on a date field,
like this (this is a trivial made up example with no real-life interpretation):

Schema (PostgreSQL v13)

create table measures (mdate date, measurement numeric);
create index measures_mdate on measures(mdate);

create view name_matches_surname as 
   select m1.*, m2.measurement as the_other_measurement 
   from measures m1 left join measures m2 on (m1.mdate = m2.mdate);
            
insert into measures values ('2021-05-11',1.0), ('2021-05-11',2.0);

If I select from this view with an equality condition on the date field, the condition would be pushed down into both sides of the join, as could be seen here:

Query #1: (= 'now') is pushed down

explain select * from name_matches_surname where mdate='now'::date;
| QUERY PLAN                                                                              |
| --------------------------------------------------------------------------------------- |
| Nested Loop Left Join  (cost=8.40..27.89 rows=36 width=68)                              |
|   Join Filter: (m1.mdate = m2.mdate)                                                    |
|   ->  Bitmap Heap Scan on measures m1  (cost=4.20..13.67 rows=6 width=36)               |
|         Recheck Cond: (mdate = '2021-05-12'::date)                                      |
|         ->  Bitmap Index Scan on measures_mdate  (cost=0.00..4.20 rows=6 width=0)       |
|               Index Cond: (mdate = '2021-05-12'::date)                                  |
|   ->  Materialize  (cost=4.20..13.70 rows=6 width=36)                                   |
|         ->  Bitmap Heap Scan on measures m2  (cost=4.20..13.67 rows=6 width=36)         |
|               Recheck Cond: (mdate = '2021-05-12'::date)                                |
|               ->  Bitmap Index Scan on measures_mdate  (cost=0.00..4.20 rows=6 width=0) |
|                     Index Cond: (mdate = '2021-05-12'::date)                            |

As you can see, both sides of the join are index scans with condition (='now') being used to access the index. So far so good. Lets try range condition (BETWEEN) instead:

Query #2: BETWEEN is pushed down into one side of the join only

explain select * from name_matches_surname 
  where mdate between 'now'::date-5 and 'now'::date;
| QUERY PLAN                                                                                        |
| ------------------------------------------------------------------------------------------------- |
| Hash Right Join  (cost=13.77..41.61 rows=38 width=68)                                             |
|   Hash Cond: (m2.mdate = m1.mdate)                                                                |
|   ->  Seq Scan on measures m2  (cost=0.00..22.70 rows=1270 width=36)                              |
|   ->  Hash  (cost=13.70..13.70 rows=6 width=36)                                                   |
|         ->  Bitmap Heap Scan on measures m1  (cost=4.21..13.70 rows=6 width=36)                   |
|               Recheck Cond: ((mdate >= '2021-05-07'::date) AND (mdate <= '2021-05-12'::date))     |
|               ->  Bitmap Index Scan on measures_mdate  (cost=0.00..4.21 rows=6 width=0)           |
|                     Index Cond: ((mdate >= '2021-05-07'::date) AND (mdate <= '2021-05-12'::date)) |

As you can see, BETWEEN was pushed down into the left side of the join only, and for the right side, we have Seq Scan. You can easily verify that type of join (inner or left) or size of the table have no bearing on the outcome of pushdown – the condition will not be pushed down into the right side of the join.

As far as I can see, this behaviour is consistent for all versions between Postgres 9.5 and 13.

What is the reason for this? Is it possible to rewrite the view in a different way so that BETWEEN would be pushed down "equally well", like (=)? Are there any references to the documentation or source code that explain the pushdown strategies in depth?


View on DB Fiddle

Best Answer

With =, the optimizer deduces that if m1.mdate = m2.mdate and m1.mdate = 'constant' , the same must hold for m2.mdate. No such inference takes place for inequality operators.