I'm trying to optimize a query in which I have some requirements where I need to be able to arbitrarily select a row ROWX
from a query. Why is it that
SELECT XML_DATA FROM
(Select ROWNUM AS MYROWNUM , EXTRACT(COL1,'expresion').getStringVal() AS XML_DATA
From "Table1" Where Col2 = 'val' AND ROWNUM < ROWX+1)
WHERE MYROWNUM = ROWX ;
Runs much faster than
SELECT XML_DATA FROM (Select ROWNUM AS MYROWNUM , EXTRACT(COL1,'expresion').getStringVal() AS XML_DATA
From "Table1" Where Col2 = 'val') AND MYROWNUM = ROWX
Obviously as ROWX grows there is some performance degradation with the first query but the second query has a constant run time (outside of the case where ROWX=1) of about 4 minutes where the base runtime for the first query is non-existent and growth is more acceptable than having to wait 4 minutes per iteration.
Version:
Oracle Database 11g Enterprise Edition 11.2.0.3.0 64bit Production
Sorry I'm a software developer (not a DBA!) that primarily works with MySQL/SQL Server so I'm not familiar with Oracle's internal workings feel free to point me in an entirely different direction,
Thanks.
Based on Justin Cave's answer I wrote the query to be deterministic and a bit cleaner (less hacky)
SELECT EXTRACT(COL1,'expresion').getStringVal() AS XML_DATA
FROM (Select ROWNUM AS MYROWNUM , CLUSTERED_ID From "Table1" Where Col2 = 'val' ORDER BY CLUSTERED_ID) t1
JOIN table1 ON (t1.CLUSTERED_ID = table1.CLUSTERED_ID)
WHERE MYROWNUM = ROWX
This has the benifit of running in constant time (where n = rowx rather than count()) and is much easier to understand.
Best Answer
The
rownum
pseudocolumn is assigned as rows are processed. The predicaterownum = <<x>>
will evaluate tofalse
if<<x>>
is anything other than 1.Conceptually (and I emphasize that this isn't how Oracle actually works, it's just a useful abstraction), the second query does something like
table1
col2 = 'val'
expressioncol2 = 'val'
, evaluate theextract
rownum
to the row that has been returned. Since no other rows have satisfied all the criteria yet, therownum
will be 1rownum = <<x>>
predicate. If<<x>>
is anything other than 1, the row is rejectedIn this loop, every row will be read from
table1
. And every row will be rejected (assuming<<x>>
is something other than 1) because of therownum = <<x>>
predicate. So you'll incur the cost of reading every row from the table, evaluating theextract
for every row, and you'll never return any data.Your first query appears to be working because it is filtering out
<<x>> + 1
rows in the subquery. Of course, since there is noORDER BY
, your first query always returns an arbitrary row. It would be entirely legal for Oracle to return the same row for every value of<<x>>
in your first query. It probably won't, of course, but it is entirely possible that you'd get different rows over time, that some rows would never be returned, that other rows would be returned multiple times, etc. If you want a deterministic result (prior to 12.1 which has some simpler syntax options), you'd need to do two levels of nestingFor a pretty thorough discussion of the various issues, take a look through this askTom thread.