Oracle Optimizer refuses to use the index

indexoptimizationoracleoracle-12cstatistics

I have a relatively simple query that SELECT data in a table.

SELECT l.columnRecovered FROM schema.TABLE l
WHERE (column1 = :a or (:a is null AND column1 is null))
AND (column2 = :b or (:b is null AND column2 is null))
AND (column3 = :c or (:c is null AND column3 is null));

This generates the following plan :

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |       |       |   309 (100)|          |
|*  1 |  TABLE ACCESS FULL| TABLE     |     1 |    56 |   309   (1)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter((("COLUMN2"=:3 OR ("COLUMN2" IS NULL AND :4 IS NULL)) AND
              ("COLUMN1"=:1 OR ("COLUMN1" IS NULL AND :2 IS NULL)) AND ((:6 IS NULL
              AND "COLUMN3" IS NULL) OR "COLUMN3"=:5)))

On this table, I have a UNIQUE INDEX that contains all those 3 columns.
But the optimizer prefer doing a FULL SCAN on the table rather than using the INDEX.

Where things become to go strange, is when I try this query without my binds but by replacing the values directly in the query.
Of course the SQL_ID changes, but this time the optimizer decides to use the INDEX and goes from 1104 gets to 6 (which is of course a great optimization).

---------------------------------------------------------------------------------------------
| Id  | Operation                   | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |               |     1 |    56 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TABLE         |     1 |    56 |     3   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | TABLE_IDX     |     1 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("COLUMN3"='DATA3' AND "COLUMN2"='DATA2' AND "COLUMN1"='DATA1')

Statistics are up to date and there are no histograms on this table.

Any way to make the optimizer use that index without modifying the query ?

I think the optimizer thinks that the possibility to have 3 NULLs might happen so this disable him from using the index is it right ?

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |     1 |    56 |   199   (2)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TABLE     |     1 |    56 |   199   (2)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("COLUMN2" IS NULL AND "COLUMN1" IS NULL AND "COLUMN3" IS
              NULL)

Best Answer

The optimizer does not use your index, not because it does not want to, but becasue it can not. For a query like that the index does not cover all possible outcomes.

When :a, :b, :c are all NULL, and you are searching for rows where column1, column2, column3 are all NULL, those rows are not indexed by a (column1, column2, column3) b-tree index, so it can not be used for retreiving the matching rows.

Adding NOT NULL constraints may help, but if you have NULL values, that is of course not possible (and I am assuming you have NULL values, otherwise why would you construct a query like this).

You could also try creating an index that includes a constant value, so even those rows will be indexed, where the indexed columns are NULL. So instead of an (column1, column2, column3) index, you could try creating a (column1, column2, column3, 'A') index.

The best solution in my opinion is however to have different queries generated whenever a bind variable was not assigned a value. Queries like the above are results of trying to handle multiple cases with a single query. This may seem to be elegant and compact, and yes, I know, generating different queries based on the presence of parameters can be cumbersome, but I have seen this kind of laziness causing severe performance issues countless times.

Yes, Oracle has some tricks such as NVL optimization, OR-expansion, for handling such predicates. Those work perfectly in Powerpoint and training materials, with simple examples, but in reality, with complex queries they can not be always used.