Different execution plan for the same query if I change a value in the predicate

execution-planoracleperformancequery-performance

I am getting two different execution plans for the same query only by changing the value in the predicate, the first one uses nested loops and the other one hash join which takes a lot longer.

explain plan for SELECT tbl0.ID
                    FROM mcoberturadetalleprimac tbl0 
                    INNER JOIN mcoberturaseccionc tbl1 ON tbl0.mcoberturaseccionc_id = tbl1.ID
                    INNER JOIN mseccionincisoc tbl2 ON tbl1.mseccionincisoc_id = tbl2.ID
                    INNER JOIN mincisoc tbl3 on tbl3.id = tbl2.mincisoc_id
                    WHERE tbl3.mcotizacionc_id = 10371; 

Plan hash value: 3143545222

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name                           | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |                                |    47 |  1974 |    82   (2)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID    | MCOBERTURADETALLEPRIMAC        |     1 |    11 |     1   (0)| 00:00:01 |
|   2 |   NESTED LOOPS                  |                                |    47 |  1974 |    82   (2)| 00:00:01 |
|   3 |    NESTED LOOPS                 |                                |    49 |  1519 |    15   (0)| 00:00:01 |
|   4 |     NESTED LOOPS                |                                |     5 |   100 |     9   (0)| 00:00:01 |
|   5 |      TABLE ACCESS BY INDEX ROWID| MINCISOC                       |     5 |    50 |     2   (0)| 00:00:01 |
|*  6 |       INDEX RANGE SCAN          | IDX_MINCISOC_01                |     5 |       |     1   (0)| 00:00:01 |
|   7 |      TABLE ACCESS BY INDEX ROWID| MSECCIONINCISOC                |     1 |    10 |     1   (0)| 00:00:01 |
|*  8 |       INDEX RANGE SCAN          | IDX_MSECCIONINCISOC_01         |     1 |       |     1   (0)| 00:00:01 |
|   9 |     TABLE ACCESS BY INDEX ROWID | MCOBERTURASECCIONC             |    10 |   110 |     1   (0)| 00:00:01 |
|* 10 |      INDEX RANGE SCAN           | IDX_MCOBERTURASECCIONC_01      |    11 |       |     1   (0)| 00:00:01 |
|* 11 |    INDEX RANGE SCAN             | IDX_MCOBERTURADETALLEPRIMAC_01 |     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("TBL3"."MCOTIZACIONC_ID"=10371)
   8 - access("TBL3"."ID"="TBL2"."MINCISOC_ID")
  10 - access("TBL1"."MSECCIONINCISOC_ID"="TBL2"."ID")
  11 - access("TBL0"."MCOBERTURASECCIONC_ID"="TBL1"."ID")

Now by just changing tbl3.mcotizacionc_id from 10371 to 368836 I get a different plan (hash join plan) which is very slow. It takes 9 secs:

explain plan for SELECT tbl0.ID
                    FROM mcoberturadetalleprimac tbl0 
                    INNER JOIN mcoberturaseccionc tbl1 ON tbl0.mcoberturaseccionc_id = tbl1.ID
                    INNER JOIN mseccionincisoc tbl2 ON tbl1.mseccionincisoc_id = tbl2.ID
                    INNER JOIN mincisoc tbl3 on tbl3.id = tbl2.mincisoc_id
                    WHERE tbl3.mcotizacionc_id = 368836;

PLAN_TABLE_OUTPUT

Plan hash value: 604514386

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                    | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                         | 45922 |  1883K|       |  9354   (4)| 00:01:53 |
|*  1 |  HASH JOIN                     |                         | 45922 |  1883K|  2008K|  9354   (4)| 00:01:53 |
|*  2 |   HASH JOIN                    |                         | 47806 |  1447K|       |  2879   (5)| 00:00:35 |
|*  3 |    HASH JOIN                   |                         |  4712 | 94240 |       |   329   (4)| 00:00:04 |
|   4 |     TABLE ACCESS BY INDEX ROWID| MINCISOC                |  4712 | 47120 |       |    69   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN          | IDX_MINCISOC_01         |  4712 |       |       |     9   (0)| 00:00:01 |
|   6 |     TABLE ACCESS FULL          | MSECCIONINCISOC         |   399K|  3905K|       |   255   (3)| 00:00:04 |
|   7 |    TABLE ACCESS FULL           | MCOBERTURASECCIONC      |  4057K|    42M|       |  2510   (4)| 00:00:31 |
|   8 |   TABLE ACCESS FULL            | MCOBERTURADETALLEPRIMAC |  3897K|    40M|       |  2058   (4)| 00:00:25 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("TBL0"."MCOBERTURASECCIONC_ID"="TBL1"."ID")
   2 - access("TBL1"."MSECCIONINCISOC_ID"="TBL2"."ID")
   3 - access("TBL3"."ID"="TBL2"."MINCISOC_ID")
   5 - access("TBL3"."MCOTIZACIONC_ID"=368836)

I can force that query to use nested loops by using query hints and it will give me exactly the same first plan (hash 3143545222) and takes only 37 ms to execute:

  SELECT /*+ USE_NL(tbl2 tbl3 tbl1 tbl0)  */ tbl0.ID
                    FROM mcoberturadetalleprimac tbl0 
                    INNER JOIN mcoberturaseccionc tbl1 ON tbl0.mcoberturaseccionc_id = tbl1.ID
                    INNER JOIN mseccionincisoc tbl2 ON tbl1.mseccionincisoc_id = tbl2.ID
                    INNER JOIN mincisoc tbl3 on tbl3.id = tbl2.mincisoc_id
                    WHERE tbl3.mcotizacionc_id = 368836;   

I think I shouldn't be using query hints and the query optimizer should do it. I have already run statistics. What could be the reason of this?

Oracle version:

Oracle Database 10g Release 10.2.0.4.0 - 64bit Production

UPDATE:

I have the following histograms, it might be related as per comment:

TABLE_NAME  COLUMN_NAME     NUM_DISTINCT    LOW_VALUE   HIGH_VALUE  NUM_BUCKETS HISTOGRAM
MINCISOC    MCOTIZACIONC_ID 80667           C25829      C3255F04    254         HEIGHT BALANCED

NUM_DISTINCT says it has 80667 but in reality if I run a count on that column I get 257369:

SELECT count(distinct mcotizacionc_id)
                    FROM mincisoc     

Best Answer

The problem was histograms, I ran statistics and disabled histogram creation and the execution plan used nested loops:

BEGIN
  DBMS_STATS.GATHER_table_STATS (OWNNAME => 'MIDAS', TABNAME => 'MINCISOC', 
  METHOD_OPT => 'FOR ALL COLUMNS SIZE 1');
END;

If I run it with FOR ALL COLUMNS SIZE AUTO again the same problem because it uses hash join. Thanks to Phil for the suggestion.