Oracle – Unexpected CASE Evaluation Logic

oracle

I always understood that the CASE statement worked on a 'short-circuit' principle in that evaluation of subsequent steps does not take place if a prior step is evaluated to true. (This answer Does SQL Server CASE statement evaluate all conditions or exit on first TRUE condition? is related but doesn't appear to cover this situation and relates to SQL Server).

In the following example, I wish to calculate the MAX(amount) between a range of months that differs based on how many months are between the start and paid dates.

(This is obviously a constructed example but the logic has valid business reasoning in the actual code where I see the issue).

If there are < 5 months between the start and paid dates then Expression 1 will be used otherwise Expression 2 will be used.

This results in the error "ORA-01428: argument '-1' is out of range" because 1 record has an invalid data condition that results in a negative value for the start of the BETWEEN clause of the ORDER BY.

Query 1

SELECT ref_no,
       CASE WHEN MONTHS_BETWEEN(paid_date, start_date) < 5 THEN
-- Expression 1
          MAX(amount)
             OVER (PARTITION BY ref_no ORDER BY paid_date ASC 
             ROWS BETWEEN MONTHS_BETWEEN(paid_date, start_date) PRECEDING
             AND CURRENT ROW)
       ELSE
-- Expression 2
           MAX(amount)
             OVER (PARTITION BY ref_no ORDER BY paid_date ASC 
             ROWS BETWEEN 5 PRECEDING AND CURRENT ROW)
       END                
    END 
  FROM payment

So I went for this second query to first eliminate anywhere this can occur:

SELECT ref_no,
       CASE WHEN MONTHS_BETWEEN(paid_date, start_date) < 0 THEN 0
       ELSE
          CASE WHEN MONTHS_BETWEEN(paid_date, start_date) < 5 THEN
             MAX(amount)
                OVER (PARTITION BY ref_no ORDER BY paid_date ASC 
                ROWS BETWEEN MONTHS_BETWEEN(paid_date, start_date) PRECEDING 
                AND CURRENT ROW)
          ELSE
             MAX(amount)
                OVER (PARTITION BY ref_no ORDER BY paid_date ASC 
                ROWS BETWEEN 5 PRECEDING AND CURRENT ROW)
          END                
       END
  FROM payment

Unfortunately, there is some unexpected behaviour that means that the values Expression 1 WOULD use are validated, even though the statement will not be executed because the negative condition is now trapped by the outer CASE.

I can get around the issue by using ABS on the MONTHS_BETWEEN in Expression 1, but I feel like this should be unnecessary.

Is this behaviour as expected ? If so 'why' as it seems illogical to me and more like a bug ?


This will create a table and test data. The query is simply me checking that the correct path in the CASE is being taken.

CREATE TABLE payment
(ref_no NUMBER,
 start_date DATE,
 paid_date  DATE,
 amount  NUMBER)
 
INSERT INTO payment
VALUES (1001,TO_DATE('01-11-2015','DD-MM-YYYY'),TO_DATE('01-01-2016','DD-MM-YYYY'),3000)

INSERT INTO payment
VALUES (1001,TO_DATE('01-11-2015','DD-MM-YYYY'),TO_DATE('12-12-2015','DD-MM-YYYY'),5000)

INSERT INTO payment
VALUES (1001,TO_DATE('10-03-2016','DD-MM-YYYY'),TO_DATE('10-02-2016','DD-MM-YYYY'),2000)

INSERT INTO payment
VALUES (1001,TO_DATE('01-11-2015','DD-MM-YYYY'),TO_DATE('03-03-2016','DD-MM-YYYY'),6000)

INSERT INTO payment
VALUES (1001,TO_DATE('01-11-2015','DD-MM-YYYY'),TO_DATE('28-11-2015','DD-MM-YYYY'),10000)

SELECT ref_no,
       CASE WHEN MONTHS_BETWEEN(paid_date, start_date) < 0 THEN '<0'
       ELSE
          CASE WHEN MONTHS_BETWEEN(paid_date, start_date) < 5 THEN
             '<5'
         --    MAX(amount)
         --       OVER (PARTITION BY ref_no ORDER BY paid_date ASC ROWS
         --       BETWEEN MONTHS_BETWEEN(paid_date, start_date) PRECEDING
         --       AND CURRENT ROW)
          ELSE
             '>=5'
         --    MAX(amount)
         --       OVER (PARTITION BY ref_no ORDER BY paid_date ASC ROWS
         --       BETWEEN 5 PRECEDING AND CURRENT ROW)
          END                
       END
  FROM payment

Best Answer

So it was hard for me to determine what your actual question was from the post, but I assume it is that when you execute:

SELECT ref_no,
   CASE WHEN MONTHS_BETWEEN(paid_date, start_date) < 0 THEN 0
   ELSE
      CASE WHEN MONTHS_BETWEEN(paid_date, start_date) < 5 THEN
         MAX(amount)
            OVER (PARTITION BY ref_no ORDER BY paid_date ASC 
            ROWS BETWEEN MONTHS_BETWEEN(paid_date, start_date) PRECEDING 
            AND CURRENT ROW)
      ELSE
         MAX(amount)
            OVER (PARTITION BY ref_no ORDER BY paid_date ASC 
            ROWS BETWEEN 5 PRECEDING AND CURRENT ROW)
      END                
   END
FROM payment

You still get ORA-01428: argument '-1' is out of range?

I don't think this is a bug. I think it is an order of operation thing. Oracle needs to do the analytics on all of the rows returned by the resultset. Then it can get down to the nitty gritty of transforming the output.

A couple of additional ways around this would be to exclude the row with a where clause:

SELECT ref_no,
   CASE WHEN MONTHS_BETWEEN(paid_date, start_date) < 5 THEN
   -- Expression 1
      MAX(amount)
         OVER (PARTITION BY ref_no ORDER BY paid_date ASC 
         ROWS BETWEEN MONTHS_BETWEEN(paid_date, start_date) PRECEDING
         AND CURRENT ROW)
   ELSE
   -- Expression 2
       MAX(amount)
         OVER (PARTITION BY ref_no ORDER BY paid_date ASC 
         ROWS BETWEEN 5 PRECEDING AND CURRENT ROW)
   END                
END 
FROM payment
-- this excludes the row from being processed
where MONTHS_BETWEEN(paid_date, start_date) > 0 

Or you could embed a case in your analytic like:

SELECT ref_no,
   CASE WHEN MONTHS_BETWEEN(paid_date, start_date) < 5 THEN
-- Expression 1
      MAX(amount)
         OVER (PARTITION BY ref_no ORDER BY paid_date ASC 
               ROWS BETWEEN 
               -- This case will be evaluated when the analytic is evaluated
               CASE WHEN MONTHS_BETWEEN(paid_date, start_date) < 0 
                THEN 0 
                ELSE MONTHS_BETWEEN(paid_date, start_date) 
                END 
              PRECEDING
              AND CURRENT ROW)
   ELSE
-- Expression 2
       MAX(amount)
         OVER (PARTITION BY ref_no ORDER BY paid_date ASC 
         ROWS BETWEEN 5 PRECEDING AND CURRENT ROW)
   END                
END 
FROM payment

Explanation

I wish I could find some documentation to back up the order of operation thing, but I haven't been able to find anything...yet.

The CASE short-circuit evaluation happens after the analytic function is evaluated. The order of operations for the query in question would be:

  1. from payment
  2. max over()
  3. case.

So since the max over() happens before the case, the query fails.

Oracle's analytic functions would be considered a row source. If you execute an explain plan on your query, you should see a "window sort" which is the analytic, generating rows, which are fed to it by the previous row source, the payment table. A case statement is an expression that is evaluated for each row in the row source. So it makes sense (to me at least), that the case happens after the analytic.