Should I group by and then join, or join and then group by

group byoracleoracle-11g-r2partitioning

(oracle's optimizer does the same thing for both approaches when just two tables are involved, thus my question involves three tables)

Assume three tables t1, t2 and t3 with a pairwise 1:n relationship from left to right:

t1 (1)--(n) t2 (1)--(n) t3

each partitioned on a common column p used in all joins and each with a primary key id, with foreign keys t2.id1 -> t1.id and t3.id2 -> t2.id.

In reality, the n (both of them) are small. Possibly about 2 t2 rows for each t1 row and about 3 t3 rows for each one from t2.

I wish to join these tables like this:

select t1.id t1id, t2.id t2id, count(1) t3count
from t1 join t2 on (t1.p = t2.p and t1.id = t2.id1)
        join t3 on (t2.p = t3.p and t2.id = t3.id2)
group by t1.id, t2.id

It gives the count of t3 rows for each t2 row and joins the t1 data to that. You could imagine getting the following output:

t1id t2id t3count
---- ---- -------
   1    1      3
   1    2      1
   1    3      2
   2    4      5
   3    5      1
   3    6      1

Now compare this query to:

with t3g as ( -- as in 't3 grouped'
    select id2, count(1) t3count
    from t3
    group by id2
)
select t1.id t1id, t2id, t3count
from t1 join t2  on (t1.p = t2.p  and t1.id = t2.id1)
        join t3g on (t2.p = t3g.p and t2.id = id2)

(This latter could have been written with a nested select instead of with).

Here I basically pre-group the t3 table, and then join it with the other two tables.

The first select gives me the following plan in oracle:

------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      | 69578 |  5367K|       |  2363   (1)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ALL  |      | 69578 |  5367K|       |  2363   (1)| 00:00:01 |     1 |1048575|
|   2 |   HASH GROUP BY       |      | 69578 |  5367K|  6336K|  2363   (1)| 00:00:01 |       |       |
|*  3 |    HASH JOIN          |      | 69578 |  5367K|       |  1083   (2)| 00:00:01 |       |       |
|   4 |     TABLE ACCESS FULL | t2   | 81635 |  1275K|       |   256   (1)| 00:00:01 |     1 |1048575|
|*  5 |     HASH JOIN         |      | 70033 |  4308K|       |   823   (1)| 00:00:01 |       |       |
|   6 |      TABLE ACCESS FULL| t1   | 81622 |  1753K|       |   616   (1)| 00:00:01 |     1 |1048575|
|   7 |      TABLE ACCESS FULL| t3   | 70033 |  2804K|       |   204   (1)| 00:00:01 |     1 |1048575|
------------------------------------------------------------------------------------------------------

The optimizer decides to join t1 and t3 first, otherwise, nothing unexpected.

The second query gives the following plan:

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         | 69578 |    21M|       |  3125   (1)| 00:00:01 |       |       |
|*  1 |  HASH JOIN                   |         | 69578 |    21M|  8144K|  3125   (1)| 00:00:01 |       |       |
|   2 |   PART JOIN FILTER CREATE    | :BF0000 | 70033 |  7317K|       |   950   (1)| 00:00:01 |       |       |
|   3 |    VIEW                      |         | 70033 |  7317K|       |   950   (1)| 00:00:01 |       |       |
|   4 |     HASH GROUP BY            |         | 70033 |  2804K|  4424K|   950   (1)| 00:00:01 |       |       |
|   5 |      PARTITION RANGE ALL     |         | 70033 |  2804K|       |   204   (1)| 00:00:01 |     1 |1048575|
|   6 |       TABLE ACCESS FULL      | t3      | 70033 |  2804K|       |   204   (1)| 00:00:01 |     1 |1048575|
|   7 |   PARTITION RANGE JOIN-FILTER|         | 81622 |    17M|       |   876   (1)| 00:00:01 |:BF0000|:BF0000|
|*  8 |    HASH JOIN                 |         | 81622 |    17M|       |   876   (1)| 00:00:01 |       |       |
|   9 |     TABLE ACCESS FULL        | t2      | 81635 |  4623K|       |   256   (1)| 00:00:01 |:BF0000|:BF0000|
|  10 |     TABLE ACCESS FULL        | t1      | 81622 |    12M|       |   616   (1)| 00:00:01 |:BF0000|:BF0000|
----------------------------------------------------------------------------------------------------------------

Now here the optimizer first groups t3 as instructed, then joins t1 and t2 and then utilizes partitioning to join the grouped t3 with the joined t1-t2.

If I add a where clause limiting by t1.id, I get the following, rather similar plans:

First select:

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |      |     1 |    79 |     7  (15)| 00:00:01 |       |       |
|   1 |  HASH GROUP BY                         |      |     1 |    79 |     7  (15)| 00:00:01 |       |       |
|   2 |   NESTED LOOPS                         |      |       |       |            |          |       |       |
|   3 |    NESTED LOOPS                        |      |     1 |    79 |     6   (0)| 00:00:01 |       |       |
|   4 |     NESTED LOOPS                       |      |     1 |    63 |     5   (0)| 00:00:01 |       |       |
|   5 |      TABLE ACCESS BY GLOBAL INDEX ROWID| t1   |     1 |    22 |     3   (0)| 00:00:01 | ROWID | ROWID |
|*  6 |       INDEX RANGE SCAN                 | t1pk |     1 |       |     2   (0)| 00:00:01 |       |       |
|   7 |      PARTITION RANGE ITERATOR          |      |     1 |    41 |     2   (0)| 00:00:01 |   KEY |   KEY |
|*  8 |       TABLE ACCESS BY LOCAL INDEX ROWID| t3   |     1 |    41 |     2   (0)| 00:00:01 |   KEY |   KEY |
|*  9 |        INDEX RANGE SCAN                | t3fk |     1 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|* 10 |     INDEX UNIQUE SCAN                  | t2pk |     1 |       |     0   (0)| 00:00:01 |       |       |
|* 11 |    TABLE ACCESS BY GLOBAL INDEX ROWID  | t2   |     1 |    16 |     1   (0)| 00:00:01 | ROWID | ROWID |
---------------------------------------------------------------------------------------------------------------

Second select:

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |      |     1 |   311 |     7  (15)| 00:00:01 |       |       |
|   1 |  NESTED LOOPS                            |      |       |       |            |          |       |       |
|   2 |   NESTED LOOPS                           |      |     1 |   311 |     7  (15)| 00:00:01 |       |       |
|   3 |    NESTED LOOPS                          |      |     1 |   253 |     6  (17)| 00:00:01 |       |       |
|   4 |     TABLE ACCESS BY GLOBAL INDEX ROWID   | t1   |     1 |   164 |     3   (0)| 00:00:01 | ROWID | ROWID |
|*  5 |      INDEX RANGE SCAN                    | t1pk |     1 |       |     2   (0)| 00:00:01 |       |       |
|   6 |     VIEW PUSHED PREDICATE                |      |     1 |    89 |     3  (34)| 00:00:01 |       |       |
|   7 |      SORT GROUP BY                       |      |     1 |    41 |     3  (34)| 00:00:01 |       |       |
|*  8 |       FILTER                             |      |       |       |            |          |       |       |
|   9 |        PARTITION RANGE SINGLE            |      |     1 |    41 |     2   (0)| 00:00:01 |   KEY |   KEY |
|* 10 |         TABLE ACCESS BY LOCAL INDEX ROWID| t3   |     1 |    41 |     2   (0)| 00:00:01 |   KEY |   KEY |
|* 11 |          INDEX RANGE SCAN                | t3fk |     1 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|* 12 |    INDEX UNIQUE SCAN                     | t2pk |     1 |       |     0   (0)| 00:00:01 |       |       |
|* 13 |   TABLE ACCESS BY GLOBAL INDEX ROWID     | t2   |     1 |    58 |     1   (0)| 00:00:01 | ROWID | ROWID |
-----------------------------------------------------------------------------------------------------------------

While I vaguely understand what the optimizer is doing, I can't conclude which query is better. Is there something glaringly obvious in the plans that should instruct me to prefer one query over the other, or do they seem rather equivalent performance-wise?

Note about limiting by t1.id: I have simplified the description of the tables and the relationships between them for the sake of simplicity. In this sense, it may seem that filtering by t1.id won't be performant for t3 because it apparently doesn't have an index containing t1.id. In reality, t2 is a weak entity, with primary key (t1.id, t2.id) and t3's foreign key towards t2 is exactly like that, including t1.id. That's why there is an index range scan on t3fk in the last two plans above.

Best Answer

The optimizer always tries to reduce the amount of data as quickly as possible. If not, your statistics might not be good.

Your plan 1 shows that less rows are processed this is good. The optimizer was able to reduce the amount of data more quickly. The numbers might not be exactly true but it gives you an idea based on the optimizer statistics.

Plan 1:

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |      |     1 |    79 |     7  (15)| 00:00:01 |       |       |
|   1 |  HASH GROUP BY                         |      |     1 |    79 |     7  (15)| 00:00:01 |       |       |
|   2 |   NESTED LOOPS                         |      |       |       |            |          |       |       |
|   3 |    NESTED LOOPS                        |      |     1 |    79 |     6   (0)| 00:00:01 |       |       |
|   4 |     NESTED LOOPS                       |      |     1 |    63 |     5   (0)| 00:00:01 |       |       |
|   5 |      TABLE ACCESS BY GLOBAL INDEX ROWID| t1   |     1 |    22 |     3   (0)| 00:00:01 | ROWID | ROWID |
|*  6 |       INDEX RANGE SCAN                 | t1pk |     1 |       |     2   (0)| 00:00:01 |       |       |
|   7 |      PARTITION RANGE ITERATOR          |      |     1 |    41 |     2   (0)| 00:00:01 |   KEY |   KEY |
|*  8 |       TABLE ACCESS BY LOCAL INDEX ROWID| t3   |     1 |    41 |     2   (0)| 00:00:01 |   KEY |   KEY |
|*  9 |        INDEX RANGE SCAN                | t3fk |     1 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|* 10 |     INDEX UNIQUE SCAN                  | t2pk |     1 |       |     0   (0)| 00:00:01 |       |       |
|* 11 |    TABLE ACCESS BY GLOBAL INDEX ROWID  | t2   |     1 |    16 |     1   (0)| 00:00:01 | ROWID | ROWID |
---------------------------------------------------------------------------------------------------------------

It shows that Oracle did a HASH GROUP BY while plan 2 did a SORT GROUP BY.

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |      |     1 |   311 |     7  (15)| 00:00:01 |       |       |
|   1 |  NESTED LOOPS                            |      |       |       |            |          |       |       |
|   2 |   NESTED LOOPS                           |      |     1 |   311 |     7  (15)| 00:00:01 |       |       |
|   3 |    NESTED LOOPS                          |      |     1 |   253 |     6  (17)| 00:00:01 |       |       |
|   4 |     TABLE ACCESS BY GLOBAL INDEX ROWID   | t1   |     1 |   164 |     3   (0)| 00:00:01 | ROWID | ROWID |
|*  5 |      INDEX RANGE SCAN                    | t1pk |     1 |       |     2   (0)| 00:00:01 |       |       |
|   6 |     VIEW PUSHED PREDICATE                |      |     1 |    89 |     3  (34)| 00:00:01 |       |       |
|   7 |      SORT GROUP BY                       |      |     1 |    41 |     3  (34)| 00:00:01 |       |       |
|*  8 |       FILTER                             |      |       |       |            |          |       |       |
|   9 |        PARTITION RANGE SINGLE            |      |     1 |    41 |     2   (0)| 00:00:01 |   KEY |   KEY |
|* 10 |         TABLE ACCESS BY LOCAL INDEX ROWID| t3   |     1 |    41 |     2   (0)| 00:00:01 |   KEY |   KEY |
|* 11 |          INDEX RANGE SCAN                | t3fk |     1 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|* 12 |    INDEX UNIQUE SCAN                     | t2pk |     1 |       |     0   (0)| 00:00:01 |       |       |
|* 13 |   TABLE ACCESS BY GLOBAL INDEX ROWID     | t2   |     1 |    58 |     1   (0)| 00:00:01 | ROWID | ROWID |
-----------------------------------------------------------------------------------------------------------------

Conclusion: In most cases I would prefer a HASH GROUP BY. So in plan 1 Oracle does a better the GROUP BY much later with much less data. 2 reasons why plan 1 is better.

I assume that this is not the only SQL statement which will be using this join constellation. You will have to decide for each SQL which way is the best.

If you have a SQL similar to the following, the with clause can be much better:

with t3g as ( -- as in 't3 grouped'
    select id2, count(1) t3count
    from t3
    where id = :1 -- new line !!!!!!!!
    group by id2
)
select t1.id t1id, t2id, t3count
from t1 join t2  on (t1.p = t2.p  and t1.id = t2.id1)
        join t3g on (t2.p = t3g.p and t2.id = id2)

UPDATE 2014-02-21 10:51 EST

With regard to your original question, you stated

Partitioning on SHA1 PRIMARY KEY
Partitioned MyISAM table with 30 partitions

OBSERVATION #1

Given the fact that SHA1 output is always 40 characters, my guess is that you are not using RANGE partitioning.

Here is what MySQL Documentation says on HASH partitioning

Partitioning by HASH is used primarily to ensure an even distribution of data among a predetermined number of partitions. With range or list partitioning, you must specify explicitly into which partition a given column value or set of column values is to be stored; with hash partitioning, MySQL takes care of this for you, and you need only specify a column value or expression based on a column value to be hashed and the number of partitions into which the partitioned table is to be divided.

Since you are not using RANGE partitioning

MySQL will place new data into a partition after figuring out which partition to choose based on how even data are distributed.
There is no mechanism to reasonably determine any ordering of the PRIMARY KEY.
Using PARTITION BY HASH creates buckets for data with no rhyme or reason for ordering.
Using PARTITION BY LINEAR HASH could create ordered buckets.

The overall problem is that the PRIMARY KEY is not an INT. You cannot set up any real RANGE partitioning for a non-scalar type (such as CHAR string). Given this scenario, MySQL cannot avoid doing a SELECT lookup without opening every partition.

OBSERVATION #2

A partitioned MyISAM table with 30 partitions is made up of

30 .MYD files
30 .MYI files

Each MYD requires a file handle. Each MYI requires a file handle. Thus, you can have 60 file handles open. Here is what you can do to see how many file handle open up after a mysql restart

FLUSH TABLES my_partitioned;
SELECT variable_value INTO @open1
FROM information_schema.global_status
WHERE variable_name='open_files';
SELECT bar FROM my_partitioned WHERE foo_ak = 42;
SELECT variable_value INTO @open2
FROM information_schema.global_status
WHERE variable_name='open_files';
SET @opendiff = @open2 - @open1;
SELECT @opendiff;

What you can conclude:

If @opendiff is close to 60, MySQL tried to open all the partitions.
If @opendiff is close to 0, MySQL had pruned the partitions

The second bulletpoint is what you want because it will show how heavy handed MySQL is at opening partitions,... or not.

Why the optimizer does not use an Index Unique Scan

I have a big Index Organized Table (IOT) with 74 columns

Why do you think that this table is a candidate for an IOT? IMHO, IOT's are meant to be narrow tables that don't change often. I have done testing with narrow tables and have found IOT's to do inserts at a slower rate than a traditional table with indexes. You may want to start by recreating this as a traditional table and then think about what indexes you want on the table. Also there is a note in this article on IOT's. Performance Tuning Guide

Best Answer

Related Solutions

Mysql – Is searching an alternate index on a partitioned theISAM table concurrent or serial

UPDATE 2014-02-21 10:51 EST

OBSERVATION #1

OBSERVATION #2

Why the optimizer does not use an Index Unique Scan

Related Question