How to improve performance for table JOINS involving millions of rows

indexjoin;oracleoracle-12cperformanceperformance-tuning

I am dealing with a performance issue for a very simple structure involving simple queries.

Consider two tables called TBL01 and TBL02. TBL01 has a master-detail relationship with TBL02. The primary key for TBL01 is called CONFIGURATION_ID. TBL02 has CONFIGURATION_ID as a foreign key to TBL01 which is also indexed. TBL01 has 2.8 million rows and TBL02 has maybe 10 million rows.

A simple INNER JOIN on TBL01.CONFIGURATION_ID = TBL02.CONFIGURATION_ID is kinda slow. By slow I mean it takes a second to return 10 rows with 15 columns in total from both TBL01 and TBL02. If I start using sorting etc, things get worse. I have dealt with very big tables in the past but never had an issue. Could there be something missing?

What should my next steps be in order to improve the performance? BITMAP index with JOIN on CONFIGURATION_ID? Would that help? Any other tips?

Note 1: I am using ORACLE 12.

Note 2: This might be obvious, but if I add a filter after the join things are better.

Best Answer

What you described is normal behaviour.

create table t1 as select * from dba_objects;
create table t2 as select * from dba_objects;
delete from t1 where object_id is null;
delete from t2 where object_name = 'T1';
alter table t1 add primary key (object_id);
alter table t2 add constraint t1_fk foreign key (object_id) references t1(object_id);
create index t2_object_id on t2(object_id);
exec dbms_stats.gather_table_stats(user, 'T1');
exec dbms_stats.gather_table_stats(user, 'T2');


explain plan for select * from t1 join t2 on (t1.object_id = t2.object_id);
select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------
Plan hash value: 1838229974

-----------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      | 23763 |  5337K|       |   516   (1)| 00:00:01 |
|*  1 |  HASH JOIN         |      | 23763 |  5337K|  2952K|   516   (1)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T1   | 23764 |  2668K|       |   115   (1)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T2   | 23763 |  2668K|       |   115   (1)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")

Note
-----
   - this is an adaptive plan

Here the database performs the join for the whole resultset with a HASH JOIN. By default, the optimizer generates a plan to return all rows from a query, which is configured with the optimizer_mode parameter.

SQL> show parameter optimizer_mode

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_mode                       string      ALL_ROWS

Your query says: return all rows from TBL1 and TBL2 that match on CONFIGURATION_ID. You never told the database you wanted only the first any 10 rows. If you want the optimizer to generate a plan that returns the first rows quicker, tell that to the optimizer: change the optimizer mode, example:

alter session set optimizer_mode=first_rows_10;

explain plan for select * from t1 join t2 on (t1.object_id = t2.object_id);
select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------
Plan hash value: 209411957

--------------------------------------------------------------------------------------------
| Id  | Operation                    | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |             |    10 |  2300 |    12   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |             |    10 |  2300 |    12   (0)| 00:00:01 |
|   2 |   NESTED LOOPS               |             |    10 |  2300 |    12   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL         | T2          | 21604 |  2426K|     2   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | SYS_C006383 |     1 |       |     0   (0)| 00:00:01 |
|   5 |   TABLE ACCESS BY INDEX ROWID| T1          |     1 |   115 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")

This time, the database performed a NESTED LOOPS JOIN.

The significant difference between HASH JOIN and NESTED LOOPS JOIN in this case: HASH JOIN has to read one of the tables completely to build the hash table for the join. NESTED LOOPS JOIN can work row by row and do index lookups by ID, then stop reading the tables after finding the 10 rows. So even if you fetch only 10 rows, with a HASH JOIN, the database will read the whole 2.8 or 10 million rows from TBL1 or TBL2, depending on the order the optimizer chose.

To make things even worse, if you want to get the first 10 rows by a speficied order, with a HASH JOIN like above, the database has to read both tables completely, perform the join for all rows, sort the ~10 million rows, and return only the first 10 of them.

With a NESTED LOOPS JOIN, if you have an index on the sort column, the database can read the rows even in the sorting order using that index, can do row by row processing, and stop after finding the first 10 rows.

The optimizer mode can be set with a parameter as above, or it can be defined at statement level as well, with a hint, like:

select /*+ first_rows(10) */ ...