How to create index to improve performance of an aggregate function that creates a table in oracle

aggregateoptimizationoracleoracle-apexplsql

I am creating an Oracle ORDS API using APEX_JSON. I recently started using bind variables instead of string concatenation using ||. I am trying to use an in clause in my where condition.

The problems begin here. The field I need to have on the left side of in is a number and the parameter to my stored procedure needs to be varchar2 as it is a comma seperated list of numbers.

Example (edited for brevity)

CREATE OR REPLACE PROCEDURE GET_CATEGORYPRODS (
    PCATEGORYID IN NUMBER,
    COMMASEPPRODUCTIDS IN VARCHAR2
) AS

l_cursor               SYS_REFCURSOR;
v_stmt_str             STRING(5000);
v_name                 NUMBER; --PRODUCT.NAME%TYPE;
v_displayorder         NUMBER; --PRODUCTCATEGORY%TYPE;
BEGIN
 v_stmt_str := 'SELECT 
    P.NAME, 
    PC.DISPLAYORDER
FROM 
    PRODUCT P
INNER JOIN
    PRODUCTCATEGORY PC
ON P.PRODUCTID = PC.PRODUCTID
WHERE 
   PC.CATEGORYID := :CATEGORYID
AND
   (P.PRODUCTID IN (SELECT * FROM TABLE(STRING_TO_TABLE_NUM(:COMMASEPPRODUCTIDS))) -- PREVIOUSLY WHERE || OCCURRED
        OR (:COMMASEPPRODUCTIDS IS NULL))';

s_counter := 0;

OPEN l_cursor FOR v_stmt_str
        USING pcategoryid, commasepproductids, commasepproductids;

FETCH l_cursor INTO
    v_productid,
    v_displayorder;

APEX_JSON.OPEN_ARRAY;
LOOP
    EXIT WHEN l_cursor%notfound;
    apex_json.open_object;
    apex_json.write('ProductID', v_productid);
    apex_json.write('DisplayOrder', v_displayorder);
    apex_json.close_object;
END LOOP;
apex_json.close_all;

END GET_CATEGORYPRODS;

Sample of parameters
'97187,142555,142568,48418,43957,44060,45160,45171,333889,333898'

To handle this problem, I created an aggregate function that takes in a string, splits on the commas, and pipes the row to a custom type.

Custom Type

create or replace type tab_number is table of number;

Aggregate Function

create or replace FUNCTION string_to_table_num (
    p VARCHAR2
)

   RETURN tab_number
   PIPELINED IS
BEGIN
   FOR cc IN (SELECT rtrim(regexp_substr(str, '[^,]*,', 1, level), ',') res
                FROM (SELECT p || ',' str FROM dual)
              CONNECT BY level <= length(str) 
                                  - length(replace(str, ',', ''))) LOOP
      PIPE ROW(lower(cc.res));
   END LOOP;
    
END;

The query slowed down significantly. I figured some optimization was needed but I had never done any sort of optimization before. After some research, I found EXPLAIN PLAN and ran it on the orginal query. I couldn't get a good answer because of the bind variables, so I decided to run it on the aggregate function.

EXPLAIN PLAN QUERIES

explain plan for select * from TABLE(string_to_table_num('97187,142555,142568,48418,43957,44060,45160,45171,333889,333898'));

SELECT * 
FROM   TABLE(DBMS_XPLAN.DISPLAY);

When I ran EXPLAIN PLAN for the aggregate function the results were:

Plan hash value: 127161297
 
---------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                     |  8168 | 16336 |    29   (0)| 00:00:01 |
|   1 |  COLLECTION ITERATOR PICKLER FETCH| STRING_TO_TABLE_NUM |  8168 | 16336 |    29   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

As I stated before, I am a noob to analyzing and optimizing queries, but 8168 Rows and 16336 bytes seems to be a lot for such a simple function. I looked into it, and found that the problem may be the lack of indexing of the pipelined table. I tried to add an index to the type tab_number but it turned it into a PL/SQL object that needed to be declared in a query, not a function.

I am pretty lost with this one. If you have any suggestions for any of the scenarios I mentioned, I am all ears. Thanks in advance.

EDIT
After following the steps in Balazs's answer below, I ran EXPLAIN PLAN for both the aggregate function and the select from my procedure.

Aggregate Function

Plan hash value: 229973419
 
------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |            |  8168 | 16336 |    29   (0)| 00:00:01 |
|   1 |  COLLECTION ITERATOR PICKLER FETCH| F_CONVERT2 |  8168 | 16336 |    29   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Select Function

Plan hash value: 1690769838
 
-------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name                                | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                                     |     6 |  2790 |   972   (0)| 00:00:01 |
|*  1 |  FILTER                                 |                                     |       |       |            |          |
|   2 |   NESTED LOOPS OUTER                    |                                     |    65 | 30225 |   972   (0)| 00:00:01 |
|   3 |    NESTED LOOPS OUTER                   |                                     |    65 | 27950 |   842   (0)| 00:00:01 |
|   4 |     NESTED LOOPS                        |                                     |    65 | 27365 |   777   (0)| 00:00:01 |
|   5 |      TABLE ACCESS BY INDEX ROWID BATCHED| PRODUCTCATEGORY                     |    65 |   845 |   712   (0)| 00:00:01 |
|*  6 |       INDEX SKIP SCAN                   | SYS_C0012982                        |    65 |       |   709   (0)| 00:00:01 |
|*  7 |      TABLE ACCESS BY INDEX ROWID        | PRODUCTNEW                          |     1 |   408 |     1   (0)| 00:00:01 |
|*  8 |       INDEX UNIQUE SCAN                 | SYS_C0013161                        |     1 |       |     0   (0)| 00:00:01 |
|*  9 |     INDEX RANGE SCAN                    | SYS_C0012993                        |     1 |     9 |     1   (0)| 00:00:01 |
|* 10 |    INDEX RANGE SCAN                     | IDX_URLMAPPER_PRODUCTID_FRIENDLYURL |     1 |    35 |     2   (0)| 00:00:01 |
|* 11 |   COLLECTION ITERATOR PICKLER FETCH     | F_CONVERT2                          |     1 |     2 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(:COMMASEPPRODUCTIDS IS NULL OR  EXISTS (SELECT 0 FROM TABLE() "KOKBF$0" WHERE VALUE(KOKBF$)=:B1))
   6 - access("pc"."CATEGORYID"=TO_NUMBER(:PCATEGORYID))
       filter("pc"."CATEGORYID"=TO_NUMBER(:PCATEGORYID))
   7 - filter("p"."PUBLISHED"=1 AND "p"."DELETED"=0)
   8 - access("p"."PRODUCTID"="pc"."PRODUCTID")
   9 - access("p"."PRODUCTID"="pm"."PRODUCTID"(+))
  10 - access("p"."PRODUCTID"="um"."PRODUCTID"(+))
  11 - filter(VALUE(KOKBF$)=:B1)

Do these numbers seem to be within a reasonable range?

EDIT 2

create or replace PROCEDURE GET_CATEGORYPRODS2 
(
  COMMASEPPRODUCTIDS IN VARCHAR2 
, COMMASEPPRODUCTSKUS IN VARCHAR2 
, PCATEGORYID IN NUMBER 
, SORTBY IN VARCHAR2 
)

....

IF sortby IS NULL OR sortby = 'null' OR sortby = '' THEN
        v_sortby := '"p".Discontinued, "pc".DisplayOrder ';
    ELSIF sortby = 'PriceAscending' THEN
        v_sortby := '"p".discontinued, "p".price ';
    ELSIF sortby = 'PriceDescending' THEN
        v_sortby := '"p".discontinued, "p".price DESC ';
    ELSIF sortby = 'Name' THEN
        v_sortby := '"p".discontinued, "p".name ';
    ELSE
        v_sortby := '"p".discontinued, "pc".displayorder ';
    END IF;
....

I was also using a bind variable to add this to the end of the query.

Best Answer

Try it like this:

create or replace type tab_number is table of number;

create or replace FUNCTION string_to_table_num (p VARCHAR2) RETURN tab_number
   res tab_nubmer;
BEGIN

   select regexp_substr(p,'[^,]+', 1, level) 
   bulk collect into res
   from dual
   connect by regexp_substr(p, '[^,]+', 1, level) is not null;

   return res;
END;

and use it similar to this:

   ids tab_nubmer := string_to_table_num(COMMASEPPRODUCTIDS);

begin

   OPEN l_cursor FOR 
   SELECT 
      P.NAME, 
      PC.DISPLAYORDER
   FROM PRODUCT P
      INNER JOIN PRODUCTCATEGORY P ON P.PRODUCTID = PC.PRODUCTID
WHERE PC.CATEGORYID = CATEGORYID
  AND P.PRODUCTID MEMBER of ids;

You don't need dynamic SQL.

An ORDER BY can be done like this:

ORDER BY "p".discontinued,
   CASE sortby 
      WHEN 'PriceAscending' THEN "p".price
      WHEN 'PriceDescending' THEN "p".price DESC
      WHEN 'Name' THEN "p".name
      ELSE "pc".DisplayOrder -- includes 'null' or ''
   END