I am dealing with a performance issue for a very simple structure involving simple queries.
Consider two tables called TBL01 and TBL02. TBL01 has a master-detail relationship with TBL02. The primary key for TBL01 is called CONFIGURATION_ID. TBL02 has CONFIGURATION_ID as a foreign key to TBL01 which is also indexed. TBL01 has 2.8 million rows and TBL02 has maybe 10 million rows.
A simple INNER JOIN on TBL01.CONFIGURATION_ID = TBL02.CONFIGURATION_ID is kinda slow. By slow I mean it takes a second to return 10 rows with 15 columns in total from both TBL01 and TBL02. If I start using sorting etc, things get worse. I have dealt with very big tables in the past but never had an issue. Could there be something missing?
What should my next steps be in order to improve the performance? BITMAP index with JOIN on CONFIGURATION_ID? Would that help? Any other tips?
Note 1: I am using ORACLE 12.
Note 2: This might be obvious, but if I add a filter after the join things are better.
Best Answer
What you described is normal behaviour.
Here the database performs the join for the whole resultset with a HASH JOIN. By default, the optimizer generates a plan to return all rows from a query, which is configured with the
optimizer_mode
parameter.Your query says: return all rows from TBL1 and TBL2 that match on CONFIGURATION_ID. You never told the database you wanted only the first any 10 rows. If you want the optimizer to generate a plan that returns the first rows quicker, tell that to the optimizer: change the optimizer mode, example:
This time, the database performed a NESTED LOOPS JOIN.
The significant difference between HASH JOIN and NESTED LOOPS JOIN in this case: HASH JOIN has to read one of the tables completely to build the hash table for the join. NESTED LOOPS JOIN can work row by row and do index lookups by ID, then stop reading the tables after finding the 10 rows. So even if you fetch only 10 rows, with a HASH JOIN, the database will read the whole 2.8 or 10 million rows from TBL1 or TBL2, depending on the order the optimizer chose.
To make things even worse, if you want to get the first 10 rows by a speficied order, with a HASH JOIN like above, the database has to read both tables completely, perform the join for all rows, sort the ~10 million rows, and return only the first 10 of them.
With a NESTED LOOPS JOIN, if you have an index on the sort column, the database can read the rows even in the sorting order using that index, can do row by row processing, and stop after finding the first 10 rows.
The optimizer mode can be set with a parameter as above, or it can be defined at statement level as well, with a hint, like: