Oracle extract slow function call from a WHERE clause

functions, oracle

I have an Oracle table in which dates are stored as the number of minutes since a specific date. It looks something like this:

CREATE TABLE MY_TABLE
(
  SOMETHING     NUMBER(15,1) NOT NULL,
  START_MINUTE  NUMBER(15,1) NOT NULL,
  STOP_MINUTE   NUMBER(15,1) NOT NULL
);

I've written two functions to convert between what are known as "MINUTE"s and the actual date/times that they represent. They look like this:

CREATE OR REPLACE FUNCTION FROM_MINUTE (MINUTE_IN IN NUMBER)
RETURN DATE AS
BEGIN
  /* Minute 0 = 12/30/1899 12:00am */
  RETURN (TO_DATE('1899-12-30', 'YYYY-MM-DD') + (MINUTE_IN / 1440));
END FROM_MINUTE;

CREATE OR REPLACE FUNCTION TO_MINUTE (DATE_IN IN DATE)
RETURN NUMBER AS
BEGIN
  /* Minute 0 = 12/30/1899 12:00am */
  RETURN
    (TRUNC(DATE_IN, 'DD') - TO_DATE('12/30/1899', 'MM/DD/YYYY')) * 1440 +
    TO_NUMBER(TO_CHAR(DATE_IN, 'HH24'))  * 60 +
    TO_NUMBER(TO_CHAR(DATE_IN, 'MI'));
END TO_MINUTE;
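
As a quick sanity check, a round trip through both functions might look like the sketch below. The value 59436000 is the one TO_MINUTE returns for 2013-01-01 (see the DUAL query further down); the 08:30 timestamp and the column aliases are just illustrative choices.

```sql
-- Minute 59436000 should map back to midnight on 2013-01-01,
-- and 08:30 on that day should be 8 * 60 + 30 = 510 minutes later.
SELECT
  TO_CHAR(FROM_MINUTE(59436000), 'YYYY-MM-DD HH24:MI') AS AS_DATE,    -- 2013-01-01 00:00
  TO_MINUTE(TO_DATE('2013-01-01 08:30', 'YYYY-MM-DD HH24:MI')) AS AS_MINUTE -- 59436510
FROM DUAL;
```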

Now I'm trying to write a query that returns all rows that overlap with a specific time frame. It's taking quite a long time to run, presumably because of the function calls. I've tried several different versions of it:

-- This compares dates by converting the minutes to dates first
SELECT * FROM MY_TABLE
WHERE FROM_MINUTE(START_MINUTE) < TO_DATE('2013-01-31', 'YYYY-MM-DD')
AND FROM_MINUTE(STOP_MINUTE) > TO_DATE('2013-01-01', 'YYYY-MM-DD');

-- This compares minutes by converting the dates to minutes first
SELECT * FROM MY_TABLE
WHERE START_MINUTE < TO_MINUTE(TO_DATE('2013-01-31', 'YYYY-MM-DD'))
AND STOP_MINUTE > TO_MINUTE(TO_DATE('2013-01-01', 'YYYY-MM-DD'));

-- Finally I just got the minute values in a separate query...
SELECT 
  TO_MINUTE(TO_DATE('2013-01-01', 'YYYY-MM-DD')) AS EARLIEST, -- Returns 59436000
  TO_MINUTE(TO_DATE('2013-01-31', 'YYYY-MM-DD')) AS LATEST    -- Returns 59479200
FROM DUAL;
-- and I plugged them directly into the query
SELECT * FROM MY_TABLE
WHERE START_MINUTE < 59479200
AND STOP_MINUTE > 59436000;

I'm pretty sure the problem is that the comparisons in the first two queries require a function call on every row to do a proper comparison. That's why the final query is so much faster, since it's just a direct comparison.

I would prefer to have a single query that runs as fast as the two-parter above. Is there a way to force a query to run the function call just once and reuse the returned value for every subsequent comparison, rather than calling the function for each row? I tried something like the following, but it didn't seem to work like I'd hoped:

SELECT * FROM MY_TABLE,
  (SELECT 
    TO_MINUTE(TO_DATE('2013-01-01', 'YYYY-MM-DD')) AS EARLIEST, -- Returns 59436000
    TO_MINUTE(TO_DATE('2013-01-31', 'YYYY-MM-DD')) AS LATEST    -- Returns 59479200
  FROM DUAL) MY_MINUTES
WHERE START_MINUTE < MY_MINUTES.LATEST
AND STOP_MINUTE > MY_MINUTES.EARLIEST;

Best Answer

The second form of your query will work if you mark your function as deterministic, meaning that for a given set of input values, it will always return the same result.
With that set, Oracle can run the conversion once for each parameter in the WHERE clause rather than for every row.

With this:

CREATE OR REPLACE FUNCTION TO_MINUTE_D (DATE_IN IN DATE)
RETURN NUMBER DETERMINISTIC AS
BEGIN
  /* Minute 0 = 12/30/1899 12:00am */
  RETURN
    (TRUNC(DATE_IN, 'DD') - TO_DATE('12/30/1899', 'MM/DD/YYYY')) * 1440 +
    TO_NUMBER(TO_CHAR(DATE_IN, 'HH24'))  * 60 +
    TO_NUMBER(TO_CHAR(DATE_IN, 'MI'));
END TO_MINUTE_D;
/

On a table filled with a large bunch of dummy rows (increasing ints), I get the following timings consistently:

SQL> SELECT * FROM MY_TABLE
WHERE START_MINUTE < TO_MINUTE(TO_DATE('2013-01-31', 'YYYY-MM-DD'))
AND STOP_MINUTE > TO_MINUTE(TO_DATE('2013-01-01', 'YYYY-MM-DD'));

no rows selected

Elapsed: 00:00:12.69

versus the deterministic-annotated function:

SQL> SELECT * FROM MY_TABLE
     WHERE START_MINUTE < TO_MINUTE_D(TO_DATE('2013-01-31', 'YYYY-MM-DD'))
     AND STOP_MINUTE > TO_MINUTE_D(TO_DATE('2013-01-01', 'YYYY-MM-DD')); 

no rows selected

Elapsed: 00:00:00.07

You should get very close to what you have with the values plugged in directly, and indexes on those columns can be used as if you'd plugged in literals.
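
If you can't change the function itself, a related approach is to wrap each call in a scalar subquery against DUAL. Treat this as a sketch rather than a guarantee: because the subqueries don't reference any column of MY_TABLE, Oracle can normally evaluate each one once per execution instead of once per row, but it's worth confirming with timings on your own data.

```sql
-- Uses the original, non-deterministic TO_MINUTE from the question.
-- Each uncorrelated scalar subquery yields a single value that is then
-- compared against the column, much like a literal would be.
SELECT * FROM MY_TABLE
WHERE START_MINUTE < (SELECT TO_MINUTE(TO_DATE('2013-01-31', 'YYYY-MM-DD')) FROM DUAL)
AND STOP_MINUTE > (SELECT TO_MINUTE(TO_DATE('2013-01-01', 'YYYY-MM-DD')) FROM DUAL);
```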

(Putting the conversion function on the start|stop_minute columns isn't a good idea in general as you've discovered, unless you have a function-based index on those that matches exactly.)
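
For completeness, here is roughly what the function-based index route would look like if you did want to keep the first form of the query. Oracle only allows a user-defined function in an index if it is declared DETERMINISTIC, so this sketch assumes a deterministic copy of FROM_MINUTE; the name FROM_MINUTE_D and both index names are made up for the example.

```sql
-- Hypothetical deterministic twin of FROM_MINUTE (same body as the original).
CREATE OR REPLACE FUNCTION FROM_MINUTE_D (MINUTE_IN IN NUMBER)
RETURN DATE DETERMINISTIC AS
BEGIN
  /* Minute 0 = 12/30/1899 12:00am */
  RETURN (TO_DATE('1899-12-30', 'YYYY-MM-DD') + (MINUTE_IN / 1440));
END FROM_MINUTE_D;
/

-- Index names are illustrative. The indexed expressions must match the
-- expressions in the WHERE clause exactly for the optimizer to consider them.
CREATE INDEX MY_TABLE_START_DT_IX ON MY_TABLE (FROM_MINUTE_D(START_MINUTE));
CREATE INDEX MY_TABLE_STOP_DT_IX ON MY_TABLE (FROM_MINUTE_D(STOP_MINUTE));

SELECT * FROM MY_TABLE
WHERE FROM_MINUTE_D(START_MINUTE) < TO_DATE('2013-01-31', 'YYYY-MM-DD')
AND FROM_MINUTE_D(STOP_MINUTE) > TO_DATE('2013-01-01', 'YYYY-MM-DD');
```

Whether the optimizer actually picks either index depends on statistics and selectivity, so check the execution plan before relying on it.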