Partition pruning with multiple date columns

data-warehousedatabase-designoracleoracle-11g-r2schema

I have a large table in Oracle 11g database that holds historical data from several years, so I would like to partition it by year. The problem is that the table has multiple date columns and they are all used in queries, so I can't just pick one date column and use it as partition key.

Most of the time dates are close to each other, so I have created partitions for each year, plus one "overflow" partition that holds the rows that cross the year boundary. Here is a simplified example:

create table t (
  start_year int,
  end_year int,
  partition_year int as (case when start_year=end_year then start_year else 0 end),
  data blob 
)
partition by range(partition_year) (
  partition poverflow values less than (1000),
  partition p2000 values less than (2001),
  partition p2001 values less than (2002),
  partition p2002 values less than (2003),
  partition p2003 values less than (2004),
  partition p2004 values less than (2005)
);

The problem with this approach is that partition_year must be explicitly referenced in queries or partition pruning (highly desirable because the table is large) doesn't take effect. This table is used for ad-hoc aggregate queries by multiple users; I can't expect that they all remember this logic.

This can be solved with a view

create or replace view v as
select *
from t
where partition_year=start_year 
  and partition_year=end_year 
  and partition_year>1000
union all
select *
from t partition (poverflow);

Now queries like this one

select * from v where start_year >= 2003 and end_year <= 2004;

Use correct partitions (5-6 + 1 in plan below):

---------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |  4030 |     2   (0)| 00:00:01 |       |       |
|   1 |  VIEW                      | V    |     1 |  4030 |     2   (0)| 00:00:01 |       |       |
|   2 |   UNION-ALL                |      |       |       |            |          |       |       |
|   3 |    PARTITION RANGE ITERATOR|      |     1 |  2041 |     2   (0)| 00:00:01 |     5 |     6 |
|*  4 |     TABLE ACCESS FULL      | T    |     1 |  2041 |     2   (0)| 00:00:01 |     5 |     6 |
|   5 |    PARTITION RANGE SINGLE  |      |     1 |  2041 |     2   (0)| 00:00:01 |     1 |     1 |
|*  6 |     TABLE ACCESS FULL      | T    |     1 |  2041 |     2   (0)| 00:00:01 |     1 |     1 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("START_YEAR">=2003 AND "END_YEAR"<=2004 AND "END_YEAR">=2003 AND 
              "START_YEAR"<=2004 AND "PARTITION_YEAR"<=2004 AND "PARTITION_YEAR"="START_YEAR" AND 
              "PARTITION_YEAR"="END_YEAR")
   6 - filter("START_YEAR">=2003 AND "END_YEAR"<=2004)

The problem is that if I replace int types with dates, this doesn't work any more. I have tried to extract the year component from dates and add corresponding constraints to the view, but partitions are not pruned. Changing the type of partition_year to date also didn't help.

Is there any way how I could have multiple date columns in a table and still be able to use partition pruning?

Best Answer

Oracle is unable to do partition pruning when a function is applied to the partitioned column. From the docs:

There are several cases when the optimizer cannot perform pruning. One common reasons is when an operator is used on top of a partitioning column. This could be an explicit operator (for example, a function) or even an implicit operator introduced by Oracle as part of the necessary data type conversion for executing the statement.

Your view has to apply some form of function to start and end dates to figure out if they're the same year or not, so I believe you're out of luck with this approach.

Our solution to a similar problem was to create materialized views over the base table, specifying different partition keys on the materialized views.

We've tailored ours to match common base queries so that we get query rewrite benefits as well. You may need to get users to use the MVs directly to ensure you get the partition pruning working as you need, rather than relying on query rewrite.

(Updated to remove incorrect example and add info regarding applying functions to partition columns)