Suppose I have the following table
data(partitioned_key_index, some_dummy_measure)
Assume that the partitions are all of an equal, large size. This is on Oracle 11g.
The end result should be like this
select partitioned_key_index, sum(some_dummy_measure)
from data group by partitioned_key_index
Each partition can be grouped independently, so the optimizer should be
clever enough to come up with a plan in which each partition is aggregated on its own, followed by a simple 'union all' to get the desired output.
What I want to do is something close to this
select 1 as partitioned_key_index, sum(some_dummy_measure)
from data where partitioned_key_index = 1
Union All
select 2 as partitioned_key_index, sum(some_dummy_measure)
from data where partitioned_key_index = 2
Union All
.
.
.
select i as partitioned_key_index, sum(some_dummy_measure)
from data where partitioned_key_index = i
My intuition with the above method is to serialize the hash group by operation,
so that each partition is moved from disk to the buffer cache in turn, in the hope that the group by operation does not spill to disk.
Any ideas on how to tune this kind of query?
Best Answer
The database does this by default; no hints or tuning are needed.
Then run the query:
Check what happened:
The operations below PARTITION LIST ALL, including the GROUP BY, were executed once per partition, and we have 5 partitions. As you can see from the Starts column, HASH GROUP BY was indeed performed 5 times.
This is what it looks like when GROUP BY is performed for the whole table at once:
HASH GROUP BY was performed only once, over the whole amount of data, after it was collected from all partitions.
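To try this yourself, something along the following lines should work. It is only a minimal sketch: the list partitioning, the partition names p1 to p5 and the generated rows are assumptions made for illustration, not the table from the original test. The gather_plan_statistics hint and dbms_xplan.display_cursor with the 'ALLSTATS LAST' format are what expose the Starts column in the plan output.

```sql
-- Hypothetical test table: list-partitioned on the grouping key, 5 partitions.
-- Column names follow the question; everything else is an assumption.
create table data (
  partitioned_key_index number,
  some_dummy_measure    number
)
partition by list (partitioned_key_index) (
  partition p1 values (1),
  partition p2 values (2),
  partition p3 values (3),
  partition p4 values (4),
  partition p5 values (5)
);

-- Generate some rows spread evenly over the five partitions.
insert into data
select mod(level, 5) + 1, level
from dual
connect by level <= 100000;
commit;

-- Run the aggregation and collect row-source execution statistics.
select /*+ gather_plan_statistics */
       partitioned_key_index, sum(some_dummy_measure)
from data
group by partitioned_key_index;

-- Show the actual plan of the last statement run in this session,
-- including the Starts column for each operation.
select * from table(dbms_xplan.display_cursor(format => 'ALLSTATS LAST'));
```

If HASH GROUP BY appears underneath PARTITION LIST ALL with Starts equal to the number of partitions, the grouping was done partition by partition. If it appears above PARTITION LIST ALL with Starts = 1, it was done once over the whole table.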