PostgreSQL – Fetch Partitions Where Values Are Present

partitioningpostgresql

I have table partitioned by range of dates (monthly).

Having a list of dates, I want to select all the partitions, where values from the list of dates are present. That is to avoid scanning the partitions.

What would be the query to do that?

Say, I have:

partitions: calendars_y2021_m01, calendars_y2021_m02, calendars_y2021_m03
list of dates: 2021-01-01, 2021-02-01, 2021-02-02

I want to:

select partitions from calendars
where dates in (2021-01-01, 2021-02-01, 2021-02-02)

That should give me all the rows from partitions: calendars_y2021_m01, calendars_y2021_m02

The only solution I got myself is to identify the target partitions and fetch them one-by-one with python loop, but that is slow.

I am using RANGE partitioning. Selecting from calendars will scan the partitions for the specific dates. I just want to get all the data from the selected partitions into memory to avoid that.

If I select * from calendars where dates in (list of dates) PostgreSQL will first (1) define target partitions, then (2) SeqScan the partitions for the values in list of dates. I want to skip the second operation and get the entire partitions.

Once I have the partitions, I will process them in memory with python pandas. That is way faster, then seq scan. Also, I am just biased towards pythonic tools.

Best Answer

SELECT DISTINCT
       format(
          'calendars_y%s_m%s',
          lpad(CAST(extract (year FROM q) AS text), 4, '0'),
          lpad(CAST(extract (month FROM q) AS text), 2, '0'))
FROM (VALUES (DATE '2021-01-01'),
             (DATE '2021-02-01'),
             (DATE '2021-02-02')) AS q(q);

       format        
---------------------
 calendars_y2021_m01
 calendars_y2021_m02
(2 rows)

But I seriously doubt that it is a good idea to do that on foot. You will be better off letting the PostgreSQL optimizer figure out which partitions to scan.