Should I group by and then join, or join and then group by

group byoracleoracle-11g-r2partitioning

(oracle's optimizer does the same thing for both approaches when just two tables are involved, thus my question involves three tables)

Assume three tables t1, t2 and t3 with a pairwise 1:n relationship from left to right:

t1 (1)--(n) t2 (1)--(n) t3

each partitioned on a common column p used in all joins and each with a primary key id, with foreign keys t2.id1 -> t1.id and t3.id2 -> t2.id.

In reality, the n (both of them) are small. Possibly about 2 t2 rows for each t1 row and about 3 t3 rows for each one from t2.

I wish to join these tables like this:

select t1.id t1id, t2.id t2id, count(1) t3count
from t1 join t2 on (t1.p = t2.p and t1.id = t2.id1)
        join t3 on (t2.p = t3.p and t2.id = t3.id2)
group by t1.id, t2.id

It gives the count of t3 rows for each t2 row and joins the t1 data to that. You could imagine getting the following output:

t1id t2id t3count
---- ---- -------
   1    1      3
   1    2      1
   1    3      2
   2    4      5
   3    5      1
   3    6      1

Now compare this query to:

with t3g as ( -- as in 't3 grouped'
    select id2, count(1) t3count
    from t3
    group by id2
)
select t1.id t1id, t2id, t3count
from t1 join t2  on (t1.p = t2.p  and t1.id = t2.id1)
        join t3g on (t2.p = t3g.p and t2.id = id2)

(This latter could have been written with a nested select instead of with).

Here I basically pre-group the t3 table, and then join it with the other two tables.

The first select gives me the following plan in oracle:

------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      | 69578 |  5367K|       |  2363   (1)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ALL  |      | 69578 |  5367K|       |  2363   (1)| 00:00:01 |     1 |1048575|
|   2 |   HASH GROUP BY       |      | 69578 |  5367K|  6336K|  2363   (1)| 00:00:01 |       |       |
|*  3 |    HASH JOIN          |      | 69578 |  5367K|       |  1083   (2)| 00:00:01 |       |       |
|   4 |     TABLE ACCESS FULL | t2   | 81635 |  1275K|       |   256   (1)| 00:00:01 |     1 |1048575|
|*  5 |     HASH JOIN         |      | 70033 |  4308K|       |   823   (1)| 00:00:01 |       |       |
|   6 |      TABLE ACCESS FULL| t1   | 81622 |  1753K|       |   616   (1)| 00:00:01 |     1 |1048575|
|   7 |      TABLE ACCESS FULL| t3   | 70033 |  2804K|       |   204   (1)| 00:00:01 |     1 |1048575|
------------------------------------------------------------------------------------------------------

The optimizer decides to join t1 and t3 first, otherwise, nothing unexpected.

The second query gives the following plan:

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         | 69578 |    21M|       |  3125   (1)| 00:00:01 |       |       |
|*  1 |  HASH JOIN                   |         | 69578 |    21M|  8144K|  3125   (1)| 00:00:01 |       |       |
|   2 |   PART JOIN FILTER CREATE    | :BF0000 | 70033 |  7317K|       |   950   (1)| 00:00:01 |       |       |
|   3 |    VIEW                      |         | 70033 |  7317K|       |   950   (1)| 00:00:01 |       |       |
|   4 |     HASH GROUP BY            |         | 70033 |  2804K|  4424K|   950   (1)| 00:00:01 |       |       |
|   5 |      PARTITION RANGE ALL     |         | 70033 |  2804K|       |   204   (1)| 00:00:01 |     1 |1048575|
|   6 |       TABLE ACCESS FULL      | t3      | 70033 |  2804K|       |   204   (1)| 00:00:01 |     1 |1048575|
|   7 |   PARTITION RANGE JOIN-FILTER|         | 81622 |    17M|       |   876   (1)| 00:00:01 |:BF0000|:BF0000|
|*  8 |    HASH JOIN                 |         | 81622 |    17M|       |   876   (1)| 00:00:01 |       |       |
|   9 |     TABLE ACCESS FULL        | t2      | 81635 |  4623K|       |   256   (1)| 00:00:01 |:BF0000|:BF0000|
|  10 |     TABLE ACCESS FULL        | t1      | 81622 |    12M|       |   616   (1)| 00:00:01 |:BF0000|:BF0000|
----------------------------------------------------------------------------------------------------------------

Now here the optimizer first groups t3 as instructed, then joins t1 and t2 and then utilizes partitioning to join the grouped t3 with the joined t1-t2.

If I add a where clause limiting by t1.id, I get the following, rather similar plans:

First select:

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |      |     1 |    79 |     7  (15)| 00:00:01 |       |       |
|   1 |  HASH GROUP BY                         |      |     1 |    79 |     7  (15)| 00:00:01 |       |       |
|   2 |   NESTED LOOPS                         |      |       |       |            |          |       |       |
|   3 |    NESTED LOOPS                        |      |     1 |    79 |     6   (0)| 00:00:01 |       |       |
|   4 |     NESTED LOOPS                       |      |     1 |    63 |     5   (0)| 00:00:01 |       |       |
|   5 |      TABLE ACCESS BY GLOBAL INDEX ROWID| t1   |     1 |    22 |     3   (0)| 00:00:01 | ROWID | ROWID |
|*  6 |       INDEX RANGE SCAN                 | t1pk |     1 |       |     2   (0)| 00:00:01 |       |       |
|   7 |      PARTITION RANGE ITERATOR          |      |     1 |    41 |     2   (0)| 00:00:01 |   KEY |   KEY |
|*  8 |       TABLE ACCESS BY LOCAL INDEX ROWID| t3   |     1 |    41 |     2   (0)| 00:00:01 |   KEY |   KEY |
|*  9 |        INDEX RANGE SCAN                | t3fk |     1 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|* 10 |     INDEX UNIQUE SCAN                  | t2pk |     1 |       |     0   (0)| 00:00:01 |       |       |
|* 11 |    TABLE ACCESS BY GLOBAL INDEX ROWID  | t2   |     1 |    16 |     1   (0)| 00:00:01 | ROWID | ROWID |
---------------------------------------------------------------------------------------------------------------

Second select:

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |      |     1 |   311 |     7  (15)| 00:00:01 |       |       |
|   1 |  NESTED LOOPS                            |      |       |       |            |          |       |       |
|   2 |   NESTED LOOPS                           |      |     1 |   311 |     7  (15)| 00:00:01 |       |       |
|   3 |    NESTED LOOPS                          |      |     1 |   253 |     6  (17)| 00:00:01 |       |       |
|   4 |     TABLE ACCESS BY GLOBAL INDEX ROWID   | t1   |     1 |   164 |     3   (0)| 00:00:01 | ROWID | ROWID |
|*  5 |      INDEX RANGE SCAN                    | t1pk |     1 |       |     2   (0)| 00:00:01 |       |       |
|   6 |     VIEW PUSHED PREDICATE                |      |     1 |    89 |     3  (34)| 00:00:01 |       |       |
|   7 |      SORT GROUP BY                       |      |     1 |    41 |     3  (34)| 00:00:01 |       |       |
|*  8 |       FILTER                             |      |       |       |            |          |       |       |
|   9 |        PARTITION RANGE SINGLE            |      |     1 |    41 |     2   (0)| 00:00:01 |   KEY |   KEY |
|* 10 |         TABLE ACCESS BY LOCAL INDEX ROWID| t3   |     1 |    41 |     2   (0)| 00:00:01 |   KEY |   KEY |
|* 11 |          INDEX RANGE SCAN                | t3fk |     1 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|* 12 |    INDEX UNIQUE SCAN                     | t2pk |     1 |       |     0   (0)| 00:00:01 |       |       |
|* 13 |   TABLE ACCESS BY GLOBAL INDEX ROWID     | t2   |     1 |    58 |     1   (0)| 00:00:01 | ROWID | ROWID |
-----------------------------------------------------------------------------------------------------------------

While I vaguely understand what the optimizer is doing, I can't conclude which query is better. Is there something glaringly obvious in the plans that should instruct me to prefer one query over the other, or do they seem rather equivalent performance-wise?

Note about limiting by t1.id: I have simplified the description of the tables and the relationships between them for the sake of simplicity. In this sense, it may seem that filtering by t1.id won't be performant for t3 because it apparently doesn't have an index containing t1.id. In reality, t2 is a weak entity, with primary key (t1.id, t2.id) and t3's foreign key towards t2 is exactly like that, including t1.id. That's why there is an index range scan on t3fk in the last two plans above.

Best Answer

The optimizer always tries to reduce the amount of data as quickly as possible. If not, your statistics might not be good.

Your plan 1 shows that less rows are processed this is good. The optimizer was able to reduce the amount of data more quickly. The numbers might not be exactly true but it gives you an idea based on the optimizer statistics.

Plan 1:

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |      |     1 |    79 |     7  (15)| 00:00:01 |       |       |
|   1 |  HASH GROUP BY                         |      |     1 |    79 |     7  (15)| 00:00:01 |       |       |
|   2 |   NESTED LOOPS                         |      |       |       |            |          |       |       |
|   3 |    NESTED LOOPS                        |      |     1 |    79 |     6   (0)| 00:00:01 |       |       |
|   4 |     NESTED LOOPS                       |      |     1 |    63 |     5   (0)| 00:00:01 |       |       |
|   5 |      TABLE ACCESS BY GLOBAL INDEX ROWID| t1   |     1 |    22 |     3   (0)| 00:00:01 | ROWID | ROWID |
|*  6 |       INDEX RANGE SCAN                 | t1pk |     1 |       |     2   (0)| 00:00:01 |       |       |
|   7 |      PARTITION RANGE ITERATOR          |      |     1 |    41 |     2   (0)| 00:00:01 |   KEY |   KEY |
|*  8 |       TABLE ACCESS BY LOCAL INDEX ROWID| t3   |     1 |    41 |     2   (0)| 00:00:01 |   KEY |   KEY |
|*  9 |        INDEX RANGE SCAN                | t3fk |     1 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|* 10 |     INDEX UNIQUE SCAN                  | t2pk |     1 |       |     0   (0)| 00:00:01 |       |       |
|* 11 |    TABLE ACCESS BY GLOBAL INDEX ROWID  | t2   |     1 |    16 |     1   (0)| 00:00:01 | ROWID | ROWID |
---------------------------------------------------------------------------------------------------------------

It shows that Oracle did a HASH GROUP BY while plan 2 did a SORT GROUP BY.

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |      |     1 |   311 |     7  (15)| 00:00:01 |       |       |
|   1 |  NESTED LOOPS                            |      |       |       |            |          |       |       |
|   2 |   NESTED LOOPS                           |      |     1 |   311 |     7  (15)| 00:00:01 |       |       |
|   3 |    NESTED LOOPS                          |      |     1 |   253 |     6  (17)| 00:00:01 |       |       |
|   4 |     TABLE ACCESS BY GLOBAL INDEX ROWID   | t1   |     1 |   164 |     3   (0)| 00:00:01 | ROWID | ROWID |
|*  5 |      INDEX RANGE SCAN                    | t1pk |     1 |       |     2   (0)| 00:00:01 |       |       |
|   6 |     VIEW PUSHED PREDICATE                |      |     1 |    89 |     3  (34)| 00:00:01 |       |       |
|   7 |      SORT GROUP BY                       |      |     1 |    41 |     3  (34)| 00:00:01 |       |       |
|*  8 |       FILTER                             |      |       |       |            |          |       |       |
|   9 |        PARTITION RANGE SINGLE            |      |     1 |    41 |     2   (0)| 00:00:01 |   KEY |   KEY |
|* 10 |         TABLE ACCESS BY LOCAL INDEX ROWID| t3   |     1 |    41 |     2   (0)| 00:00:01 |   KEY |   KEY |
|* 11 |          INDEX RANGE SCAN                | t3fk |     1 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|* 12 |    INDEX UNIQUE SCAN                     | t2pk |     1 |       |     0   (0)| 00:00:01 |       |       |
|* 13 |   TABLE ACCESS BY GLOBAL INDEX ROWID     | t2   |     1 |    58 |     1   (0)| 00:00:01 | ROWID | ROWID |
-----------------------------------------------------------------------------------------------------------------

Conclusion: In most cases I would prefer a HASH GROUP BY. So in plan 1 Oracle does a better the GROUP BY much later with much less data. 2 reasons why plan 1 is better.

I assume that this is not the only SQL statement which will be using this join constellation. You will have to decide for each SQL which way is the best.

If you have a SQL similar to the following, the with clause can be much better:

with t3g as ( -- as in 't3 grouped'
    select id2, count(1) t3count
    from t3
    where id = :1 -- new line !!!!!!!!
    group by id2
)
select t1.id t1id, t2id, t3count
from t1 join t2  on (t1.p = t2.p  and t1.id = t2.id1)
        join t3g on (t2.p = t3g.p and t2.id = id2)