I have two queries:
1) This query has a ROWNUM column (takes 20 seconds to execute):
SELECT
ROWNUM
,ROAD_ID
,VERTEX_INDEX
,SDE.ST_X(ST_POINT) AS X
,SDE.ST_Y(ST_POINT) AS Y
FROM
(
SELECT
ROWNUM
,a.ROAD_ID
,b.NUMBERS VERTEX_INDEX
,SDE.ST_PointN(a.SHAPE, b.NUMBERS) AS ST_POINT
FROM ENG.ROAD a
CROSS JOIN ENG.NUMBERS b
WHERE b.NUMBERS <= SDE.ST_NUMPOINTS(a.SHAPE)
)
--removed to do explain plan: ORDER BY ROAD_ID, VERTEX_INDEX
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5996 | 322K| | 262 (1)| 00:00:01 |
| 1 | COUNT | | | | | | |
| 2 | VIEW | | 5996 | 322K| | 262 (1)| 00:00:01 |
| 3 | COUNT | | | | | | |
| 4 | MERGE JOIN | | 5996 | 1545K| | 262 (1)| 00:00:01 |
| 5 | INDEX FULL SCAN | R23715_SDE_ROWID_UK | 30 | 90 | | 1 (0)| 00:00:01 |
|* 6 | SORT JOIN | | 3997 | 1018K| 2392K| 261 (1)| 00:00:01 |
| 7 | TABLE ACCESS FULL| ROAD | 3997 | 1018K| | 34 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
" 6 - access(""B"".""NUMBERS""<=""SDE"".""ST_NUMPOINTS""(""A"".""SHAPE""))"
" filter(""B"".""NUMBERS""<=""SDE"".""ST_NUMPOINTS""(""A"".""SHAPE""))"
2) This query does not have a ROWNUM column (takes 40 seconds to execute):
SELECT
RDSEC
,VERTEX_INDEX
,SDE.ST_X(ST_POINT) AS X
,SDE.ST_Y(ST_POINT) AS Y
FROM
(
SELECT
a.RDSEC
,b.NUMBERS VERTEX_INDEX
,SDE.ST_PointN(a.SHAPE, b.NUMBERS) AS ST_POINT
FROM INFRASTR.STRLN_ROUTE_SIMPLIFY a
CROSS JOIN INFRASTR.NUMBERS b
WHERE b.NUMBERS <= SDE.ST_NUMPOINTS(a.SHAPE)
)
--removed to do explain plan: ORDER BY RDSEC, VERTEX_INDEX
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5996 | 1545K| | 262 (1)| 00:00:01 |
| 1 | MERGE JOIN | | 5996 | 1545K| | 262 (1)| 00:00:01 |
| 2 | INDEX FULL SCAN | R23715_SDE_ROWID_UK | 30 | 90 | | 1 (0)| 00:00:01 |
|* 3 | SORT JOIN | | 3997 | 1018K| 2392K| 261 (1)| 00:00:01 |
| 4 | TABLE ACCESS FULL| ROAD | 3997 | 1018K| | 34 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
" 3 - access(""B"".""NUMBERS""<=""SDE"".""ST_NUMPOINTS""(""A"".""SHAPE""))"
" filter(""B"".""NUMBERS""<=""SDE"".""ST_NUMPOINTS""(""A"".""SHAPE""))"
Why does adding ROWNUM in the first query improve performance? I would have thought it would have slowed things down, if anything.
There is some background info on the query here.
Update:
I had incorrectly assumed that I was unable to run explain plans. I'm using mere GIS software to run the query; not SQLplus (or any other proper database software).
I have been told otherwise; I'm now using DBMS_XPLAN to run explain plans.
Best Answer
Posting the query plan for the two versions of the query (and the exact SQL for the other version just for clarity's sake) would certainly be helpful. I would guess that doing so would prove out the following theory. Without it, though, I can guess what is likely going on but I can't be sure.
In general, the database is free to evaluate the elements of a query in whatever order it expects to be most efficient. That may mean that it executes the inline view first in its entirety and then applies the outer projection which would include calling the
SDE.ST_X
andSDE.ST_Y
functions. Or it may mean that it transforms the query so that the outer functions get called for every row and that outer predicates, if any, get pushed into the inline view.If you add a
rownum
to the inline view, in any current version of Oracle, you prevent the optimizer from pushing logic into the inline view since the optimizer can't be sure that it won't impact the results that are returned. In theory, some version of the future optimizer might be smart enough to figure out that evaluating theSDE.ST_X
andSDE.ST_Y
functions in the inline view wouldn't actually change therownum
that is generated but that pushing theorder by
could change the result so some transforms would be allowed and others are rejected. But in the present, adding arownum
acts to basically force the optimizer's hand to leave the inline view "as is".Most likely, the optimizer's estimate for the cost and the selectivity of the various function calls are off. If the
b.NUMBERS <= SDE.ST_NUMPOINTS(a.SHAPE)
function is less expensive than the optimizer expects and filters out much more data than the optimizer expects, it may decide that it is more efficient to callSDE.ST_X
andSDE.ST_Y
on every row rather than filtering all the data first and then taking a second pass through it to call the functions and do the sorting. Ideally, you'd give the optimizer better statistics about the various functions so that it could come up with the more efficient plan without you needing to add arownum
(particularly since that trick might suddenly stop working at some point in the future) but I've certainly been known to add arownum
like this a time or two (along with a comment explaining why) if I needed a quick-and-dirty fix.