I'm a little bit confused about how query optimizer eliminates unnecessary columns from view's query when you need to select only specific one.
Here is my view:
CREATE VIEW Schema1.Object1
AS
SELECT Object2.Column1 AS Column2,
Object2.Column3 AS Column4,
Object3.Column5,
Object3.Column6,
Object4.Column1 AS Column7,
Object5.Column1 AS Column8,
Object5.Column9 AS Column10,
Object5.Column11 AS Column12,
Object5.Column13 AS Column14,
Object5.Column15 AS Column16,
Object5.Column17 AS Column18,
Object5.Column19 AS Column20,
Object5.Column21 AS Column22,
Object5.Column23 AS Column24,
CASE
WHEN Object2.Column25 >= Object5.Column25
THEN Object2.Column25
ELSE Object5.Column25
END AS Column25,
Object2.Column26
FROM Schema1.Object6 AS Object2
CROSS JOIN Schema2.Object7 AS Object4
JOIN Schema1.Object8 AS Object3
ON Object3.Column5 = Object2.Column5
AND Object3.Column7 = Object4.Column1
LEFT JOIN Schema1.Object9 AS Object5
ON Object5.Column4 = Object2.Column3
JOIN Schema3.Object10 AS Object11
ON Object5.Column26 = Object11.Column27
Now, I just want to do a query like:
SELECT Column2
FROM Schema1.Object1
…but the estimated plan includes all columns from a view even though the query doesn't return them.
Why has this happened? Can I avoid this?
Here is a link for query plan: https://1drv.ms/u/s!AhdjYi359YDTgYF6sLo8fBsj5H6Ilg
also available at https://www.brentozar.com/pastetheplan/?id=S1MhIB3Ff
Best Answer
Each node in the plan only projects the columns minimally needed to satisfy the query correctly. You can see this by looking at the Output List of each operator. For example, the final join only lists one column. SQL Server is very good at removing unneeded projections.
I think perhaps you are expecting one or more of the joins to be eliminated. This is more tricky, because the optimizer has to be careful to not change the meaning of the query. There are four reasons to keep a join as Rob Farley describes in JOIN simplification in SQL Server:
Extra columns. The join is needed to provide columns needed by the query, either because the column appears in the final result, or it is needed for some intermediate step (like filtering, or a different join).
Row duplication. A join can increase the number of rows matched. For example a single row in table A might join with two rows in table B, so the result would contain two copies of the information in the table A row.
Row elimination. An inner join can eliminate rows from table A that do not join with any row in table B.
Added NULLs. A right or full join can introduce new NULLs where a row in table A does not match a row in table B.
SQL Server is quite good (though not perfect) at removing unnecessary joins where it is safe to do so. A join can only be removed if the optimizer has a guarantee that none of the four join effects above will affect the result.
In your case, it is likely that item #2 and #3 above applies. You may be able to make the view more simplifiable by using left joins instead of inner joins, and perhaps adding a
DISTINCT
to your outer query. See Rob's article for examples.