Fast paginated result, for various filter clauses

execution-planoracleoracle-11g

I have been working on obtaining a paginated result from a table in my database (11g). While I have a query that does work (ie, the result is correct) it does not perform as well as I'd like to and I am trying to improve its efficiency (est. 60 calls per second on that query only).

So first of all, I read What is an Effective Way to Count the Number of Pages in Oracle?, and the article pointed too, but unfortunately it does not discuss the execution plan of the queries presented at all (I'll get to see them come Monday).

Here is the query I had come up with (the table is partitioned on part_code which is a date range):

select <all-columns> from (
    select rownum as rnum, <all-columns> from (
        select /*+index (my_table index)*/ <all-columns> from my_table
        where part_code in (?, ?, ?)
          and date between ? and ?
          and type in (?, ?, ?)
          and func_col1 = ?
          /*potentially: and func_col2 = ? and func_col3 = ? ... */
        order by date, type, id
     )
) where rnum between M and N; /* N-M ~= 30 */

Note: most of the queries will be carried out with a single func_xxx filter, and I think M will be small… but have no guarantee and in theory M can be up to 9999.

Note: ~72 partitions in total, but only ~39 active at most, ~300,000 different values of func_col1 (but with some having ~50,000 rows and some having 1 row), ~10 values of type, id is unique (generated through a sequence).

It does work, but there is a nasty surprise in the execution plan: the 2nd query (with rownum as rnum) is completely executed, fetching up to ~50,000 rows from the DB, before the pagination kicks in. Fetching ~50,000 rows from the DB to return some ~30 of them seems particularly inefficient.

In the execution-plan this shows up as:

View
\_ Filter (rnum)
   \_ View <= here comes the trouble
      \_ Sort
         \_ ...

I can create an index over the table if necessary, and I can transform the existing partitioned index (part_code, func_col1, date, type, id). The index has exactly the order required by my order by clause; but it does not seem to be sufficient (and removing the order by clause does not get me any closer anyway).

Is there a way to prevent the "materialization" of the view and have Oracle build it on the fly as the outer query needs more data (ie, moving to lazy evaluation of the inner queries) ?

Best Answer

I believe you should be aiming for a plan that avoid any actual sort operation, and "stops short" as soon as possible.

To avoid the sort (and "materializing" the inner view), your sort order must match exactly the index columns, or your where clauses must be strict equals only on all the leading columns. Otherwise there will be a need to sort subsets, and that will require evaluating the whole inner view.

Here's an example with a hypothetical foo table:

create table foo (a number, b number, c varchar(100));
-- fill with dummy data
exec dbms_stats.gather_table_stats(ownname => user, tabname => 'FOO');
create index foo_ix on foo(a, b, c);

Closest simple equivalent to your select:

select * from (
  select rownum r, i.* from (
    select a, b, c
    from foo
    where a in (3,4) and b = 3
    order by c
  )  i
) where r between 5 and 10;

The explain plan isn't good:

--------------------------------------------------------------------------------
| Id  | Operation             | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |        |   163 | 14833 |     5  (20)| 00:00:01 |
|*  1 |  VIEW                 |        |   163 | 14833 |     5  (20)| 00:00:01 |
|   2 |   COUNT               |        |       |       |            |          |
|   3 |    VIEW               |        |   163 | 12714 |     5  (20)| 00:00:01 |
|   4 |     SORT ORDER BY     |        |   163 |  9291 |     5  (20)| 00:00:01 |
|   5 |      INLIST ITERATOR  |        |       |       |            |          |
|*  6 |       INDEX RANGE SCAN| FOO_IX |   163 |  9291 |     4   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("R"<=10 AND "R">=5)
   6 - access(("A"=3 OR "A"=4) AND "B"=3)

The count is too late, and isn't actually (I think) "stopping things short" in this case.

Add the index columns to the order (might not meet your requirements, sorry):

select * from (
  select rownum r, i.* from (
    select a, b, c
    from foo
    where a in (3,4) and b = 3
    order by a, b, c
  )  i
) where r between 5 and 10;
-------------------------------------------------------------------------------
| Id  | Operation            | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |        |   163 | 14833 |     4   (0)| 00:00:01 |
|*  1 |  VIEW                |        |   163 | 14833 |     4   (0)| 00:00:01 |
|   2 |   COUNT              |        |       |       |            |          |
|   3 |    VIEW              |        |   163 | 12714 |     4   (0)| 00:00:01 |
|   4 |     INLIST ITERATOR  |        |       |       |            |          |
|*  5 |      INDEX RANGE SCAN| FOO_IX |   163 |  9291 |     4   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("R"<=10 AND "R">=5)
   5 - access(("A"=3 OR "A"=4) AND "B"=3)

At least we got rid of the sort. Now let's try and "stop short":

select * from (
  select rownum r, i.* from (
    select a, b, c
    from foo
    where a in (3,4) and b = 3
    order by a, b, c
  )  i where rownum < 10
) where r > 5;
-------------------------------------------------------------------------------
| Id  | Operation            | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |        |     9 |   819 |     4   (0)| 00:00:01 |
|*  1 |  VIEW                |        |     9 |   819 |     4   (0)| 00:00:01 |
|*  2 |   COUNT STOPKEY      |        |       |       |            |          |
|   3 |    VIEW              |        |     9 |   702 |     4   (0)| 00:00:01 |
|   4 |     INLIST ITERATOR  |        |       |       |            |          |
|*  5 |      INDEX RANGE SCAN| FOO_IX |     9 |   513 |     4   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("R">5)
   2 - filter(ROWNUM<10)
   5 - access(("A"=3 OR "A"=4) AND "B"=3)

This might be enough for your query: notice the count stopkey (rownum < magic, doesn't kick in with between in what I've seen) and the "rows" column - the inner scan doesn't have to bother about fetching more rows after it's found N.

With the above, you'll still be reading N rows from the table.

You could limit that if all the search criteria are indexed: do the above filtering but retrieving only the ROWID from each match rather than all the columns, and then access the table by ROWID.

select * from foo where rowid in (
 select rid from (
    select rownum r, i.* from (
      select rowid rid
      from foo
      where a in (3,4) and b = 3
      order by a, b, c
    )  i where rownum < 10
  ) where r > 5
);
----------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |     1 |    78 |     6  (17)| 00:00:01 |
|   1 |  NESTED LOOPS               |          |     1 |    78 |     6  (17)| 00:00:01 |
|   2 |   VIEW                      | VW_NSO_1 |     9 |   108 |     4   (0)| 00:00:01 |
|   3 |    HASH UNIQUE              |          |     1 |   225 |            |          |
|*  4 |     VIEW                    |          |     9 |   225 |     4   (0)| 00:00:01 |
|*  5 |      COUNT STOPKEY          |          |       |       |            |          |
|   6 |       VIEW                  |          |     9 |   108 |     4   (0)| 00:00:01 |
|   7 |        INLIST ITERATOR      |          |       |       |            |          |
|*  8 |         INDEX RANGE SCAN    | FOO_IX   |     9 |   594 |     4   (0)| 00:00:01 |
|   9 |   TABLE ACCESS BY USER ROWID| FOO      |     1 |    66 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("R">5)
   5 - filter(ROWNUM<10)
   8 - access(("A"=3 OR "A"=4) AND "B"=3)

This does not work if any of the search fields are not in the index. And make sure you trace this with real data & usual search criteria to see if it materially makes a difference - it might actually be worse especially for low values of M (which is probably the case you want to be the fastest) or high values of N-M.