I'm trying to optimize a view which grabs the most recent history record for an item based on some specific criteria. I wrote a query which ran fairly quickly, and then turned it into a view.
The query looks like the following:
with specs as (
select 't1' report, 'MC' flag, 1 typer union
select 't1', 'CBC', 1 union
select 't1', 'RIC', 1 union
select 't2', 'SMC', 1 union
select 't3', 'IRC', 1 union
select 't3', 'IMC', 1 union
select 't3', 'IAC', 1 union
select 't4', 'ISMC', 1 union
select 't5', 'MC', 1 union
select 't5', 'CBC', 1 union
select 't5', 'RIC', 1 union
select 't5', 'AC', 1 union
select 't5', 'SMC', 1 union
select 't5', 'SAC', 1 union
select 't5', 'IRC', 1 union
select 't5', 'IMC', 1 union
select 't5', 'IAC', 1 union
select 't5', 'ISMC', 1 union
select 't5', 'ISAC', 1 union
select 't6', 'C', 2
) ,
years as (
select YEAR(getdate()) y
union all
select y -1
from years
where y > YEAR(getdate())-3
)
select rank() over (partition by report,PC_P.p_id order by mast.year_id desc, mast.HISTORY_ID) ranks,
y.y filterYear,
mast.YEAR_ID,
mast.HISTORY_ID,
PC_P.P_ID,
PC_P.MAP MAP,
report,
INSP_FLAG,
REASON_CODE
from table1 PC_P WITH(NOLOCK)
Cross Join
years y
inner join table2 mast WITH(NOLOCK)
on mast.P_ID = PC_P.P_ID
and mast.HISTORY_ID <> -32765
and mast.YEAR_ID <=y.y
inner join
table3 ma with(nolock)
on mast.YEAR_ID = ma.YEAR_ID
and mast.HISTORY_ID =ma.HISTORY_ID
and mast.P_ID = ma.P_ID
inner join
specs
on (typer=1 and (flag =INSP_FLAG or REASON_CODE='C')) or (typer=2 and REASON_CODE=flag)
where MAP between 'town-000000000' and 'town-999999999'
and Report = 't1'
and y.y =2019
The view looks like the following:
CREATE view [dbo].[TEST_view] as
with specs as (
select 't1' report, 'MC' flag, 1 typer union
select 't1', 'CBC', 1 union
select 't1', 'RIC', 1 union
select 't2', 'SMC', 1 union
select 't3', 'IRC', 1 union
select 't3', 'IMC', 1 union
select 't3', 'IAC', 1 union
select 't4', 'ISMC', 1 union
select 't5', 'MC', 1 union
select 't5', 'CBC', 1 union
select 't5', 'RIC', 1 union
select 't5', 'AC', 1 union
select 't5', 'SMC', 1 union
select 't5', 'SAC', 1 union
select 't5', 'IRC', 1 union
select 't5', 'IMC', 1 union
select 't5', 'IAC', 1 union
select 't5', 'ISMC', 1 union
select 't5', 'ISAC', 1 union
select 't6', 'C', 2
) ,
years as (
select YEAR(getdate()) y
union all
select y -1
from years
where y > YEAR(getdate())-3
)
select rank() over (partition by report,PC_P.p_id order by mast.year_id desc, mast.HISTORY_ID) ranks,
y.y filterYear,
mast.YEAR_ID,
mast.HISTORY_ID,
PC_P.P_ID,
PC_P.MAP MAP,
report,
INSP_FLAG,
REASON_CODE
from table1 PC_P WITH(NOLOCK)
Cross Join
years y
inner join table2 mast WITH(NOLOCK)
on mast.P_ID = PC_P.P_ID
and mast.HISTORY_ID <> -32765
and mast.YEAR_ID <=y.y
inner join
table3 ma with(nolock)
on mast.YEAR_ID = ma.YEAR_ID
and mast.HISTORY_ID =ma.HISTORY_ID
and mast.P_ID = ma.P_ID
inner join
specs
on (typer=1 and (flag =INSP_FLAG or REASON_CODE='C')) or (typer=2 and REASON_CODE=flag)
I then ran a query on the view with the same where clause as the initial query:
select *
from [TEST_view]
where TAX_MAP between 'town-000000000' and 'town-999999999'
and Report = 't1'
and filterYear = 2019
What I found was that it took ~60 times longer to run the query on the view than the initial query.
I can only guess that it's treating the view as a subquery, and is grabbing all the rows in the db and doing calculations first, and then filtering based on the criteria in the where clause.
I took a quick look at the msdn docs and didn't see a description of how the query optimizer works when filtering a view.
Why is the query on the view so much slower than the standalone query?
Best Answer
I will try to explain again why the ranking function doesn't allow for the filters to be applied on the lowest level possible. Here is a query and the same query but as a view and the filter applied to the view:
You can see that the filter has been pushed as the last step in the execution plan. SQL Server must calculate the ranking function first for the whole dataset and only then it can apply the filter otherwise the result will be wrong.
Also by comparing the result you can see that the queries only look similar but actually provide different results for RN column.