Following the Oracle docs here, I'm trying to paginate a query on a table with almost 1 million rows.
The problem is that once the table has millions of rows, performance becomes bad. The access path chosen by Oracle (Index Fast Full Scan) does not return rows in index order, so the whole result set is being sorted again.
As an example:
create table sample1 (
    id int primary key,
    c1 varchar2(3000) not null,
    c2 varchar2(3000) not null,
    c3 varchar2(10)
);
create unique index index_unique3 on sample1(c1, c2);
create sequence seq_id1
minvalue 0
maxvalue 999999999999
start with 1
increment by 1
cache 20;
INSERT INTO sample1
SELECT seq_id1.nextval,
dbms_random.string('U',trunc(dbms_random.value(1,3000))),
dbms_random.string('U',trunc(dbms_random.value(1,3000))),
dbms_random.string('U',trunc(dbms_random.value(1,1)))
FROM dual
CONNECT BY level <= 5000;
analyze table sample1 compute statistics;
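As a side note, ANALYZE ... COMPUTE STATISTICS is deprecated for gathering optimizer statistics; the supported mechanism on 12c is DBMS_STATS. A possible equivalent, assuming the table lives in your own schema:

```sql
-- Gather table and index statistics with DBMS_STATS
-- (the supported replacement for ANALYZE ... COMPUTE STATISTICS)
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,       -- current schema
    tabname => 'SAMPLE1',
    cascade => TRUE);      -- also gather statistics on the indexes
END;
/
```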
select * from (
select a.*, rownum r
from (
select c1, c2
from sample1
order by c1, c2
) a
where rownum <= 24
)
where r >= 2;
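On 12c the same page can also be written with the row-limiting clause. This is only a sketch of the equivalent syntax for the rows 2 through 24 page above; it does not by itself change the access path the optimizer picks:

```sql
-- 12c row-limiting clause: skip the first row, return the next 23
-- (same page as ROWNUM between 2 and 24 in the nested query)
select c1, c2
from sample1
order by c1, c2
offset 1 rows fetch next 23 rows only;
```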
With the first rows, the execution plan shows "Index Full Scan", but after inserting a couple of thousand rows (and gathering statistics again), it begins to use "Index Fast Full Scan".
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 24 | 72408 | 1595 | 00:00:01 |
| * 1 | VIEW | | 24 | 72408 | 1595 | 00:00:01 |
| * 2 | COUNT STOPKEY | | | | | |
| 3 | VIEW | | 2002 | 6014008 | 1595 | 00:00:01 |
| * 4 | SORT ORDER BY STOPKEY | | 2002 | 6072066 | 1595 | 00:00:01 |
| 5 | INDEX FAST FULL SCAN | INDEX_UNIQUE3 | 2002 | 6072066 | 326 | 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
------------------------------------------
* 1 - filter("R">=2)
* 2 - filter(ROWNUM<=24)
* 4 - filter(ROWNUM<=24)
And if the columns c1 and c2 are defined as ints, the execution plan turns to "Index Full Scan" and timings are pretty good.
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 24 | 936 | 3 | 00:00:01 |
| * 1 | VIEW | | 24 | 936 | 3 | 00:00:01 |
| * 2 | COUNT STOPKEY | | | | | |
| 3 | VIEW | | 24 | 624 | 3 | 00:00:01 |
| 4 | INDEX FULL SCAN | INDEX_UNIQUE1 | 1420002 | 14200020 | 3 | 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
------------------------------------------
* 1 - filter("R">=2)
* 2 - filter(ROWNUM<=24)
How can I eliminate the sort operation on the varchar table in order to speed up the query? There are a lot of docs explaining how to write the query, but none explains how to optimize it.
select * from v$version;
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
Update
For the varchar table, setting the parameter optimizer_index_cost_adj to 10 makes Oracle use an Index Full Scan instead of an Index Fast Full Scan, but according to the execution plan it is still sorting the data.
SELECT STATEMENT, GOAL = ALL_ROWS
VIEW Object owner=SYS
COUNT STOPKEY
VIEW
SORT ORDER BY STOPKEY
INDEX FULL SCAN
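For reference, the parameter can be changed at session level to test this. This is just a sketch for experimenting; changing it system-wide skews all optimizer costing and is generally not recommended:

```sql
-- Lower the perceived cost of index access for this session only
-- (default is 100; 10 makes index access paths look 10x cheaper)
alter session set optimizer_index_cost_adj = 10;
```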
Update 1
OK, after some research I found that the problem was the value of NLS_SORT = BINARY. After setting it to a language-specific sort, the problem disappears.
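To reproduce this, the session sort setting can be inspected and changed as below. SPANISH is just an example of a linguistic sort name, not a recommendation; test the execution plan after changing it:

```sql
-- Check the current session sort order
select value
from nls_session_parameters
where parameter = 'NLS_SORT';

-- Switch from BINARY to a linguistic sort for this session
alter session set nls_sort = SPANISH;
```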
Best Answer
Not exactly sure what your question is. An index fast full scan cannot eliminate the subsequent sort, because it reads index blocks in the order they exist on the media, not necessarily in the index sort order, as explained in the documentation.
Now, as to why the Oracle optimizer chooses a fast full scan for VARCHAR2(3000) columns but a full scan for integer (I presume you mean NUMBER) columns, I can only speculate that in the latter case the optimizer expects the index leaf pages to be more densely packed with values, requiring fewer reads. It cannot know the actual lengths of your VARCHAR2 values, so it has to assume the worst case (which I guess could be up to 6002 bytes with single-byte characters).
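If the goal is simply to avoid the sort on the varchar table, one option worth testing is hinting the ordered access path explicitly. This is a sketch; whether it is actually faster depends on how many leaf blocks the ordered walk has to visit:

```sql
-- Hint an ordered walk of INDEX_UNIQUE3 so no SORT ORDER BY is needed
select * from (
    select a.*, rownum r
    from (
        select /*+ index(sample1 index_unique3) */ c1, c2
        from sample1
        order by c1, c2
    ) a
    where rownum <= 24
)
where r >= 2;
```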