(Expanding on my comment on the question.)
Without a unique constraint on the combination of AreaId and ParameterTypeId, the given code is broken because @UpdatedId = target.Id will only ever record a single row Id.
Unless you tell it so, SQL Server can't implicitly know the possible states of the data. Either the constraint should be enforced, or, if multiple rows are valid, the code will need to be changed to use a different mechanism to output the Id values.
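One such mechanism is an OUTPUT ... INTO clause that writes every affected Id into a table variable. A minimal sketch, assuming hypothetical table, column, and variable names (none of them taken from the question):

```sql
-- Sketch only: the table, column, and variable names below are assumptions,
-- not taken from the question.
DECLARE @AreaId int = 1,
        @ParameterTypeId int = 2,
        @NewValue decimal(9, 2) = 0;

DECLARE @UpdatedIds table (Id int NOT NULL);

UPDATE target
SET    target.Value = @NewValue
OUTPUT inserted.Id INTO @UpdatedIds (Id)   -- records every affected Id, not just one
FROM   dbo.Measurements AS target
WHERE  target.AreaId          = @AreaId
AND    target.ParameterTypeId = @ParameterTypeId;

SELECT Id FROM @UpdatedIds;                -- all updated Ids are available here
```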
Because of the possibility that the scan operator will come across multiple matching rows, the query must eager spool all the matches for Halloween protection. As indicated in the comments, the constraint is valid, so adding it will not only change the plan from a scan to a seek, but also eliminate the need for the table spool, as SQL Server will know there is going to be either 0 or 1 rows returned from the seek operator.
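If the constraint route is taken, the declaration would look something like this (the table name is an assumption; only the column pair comes from the discussion above):

```sql
-- Sketch: dbo.Measurements is a placeholder table name; only the column pair
-- (AreaId, ParameterTypeId) comes from the discussion above.
ALTER TABLE dbo.Measurements
ADD CONSTRAINT UQ_Measurements_AreaId_ParameterTypeId
    UNIQUE (AreaId, ParameterTypeId);
```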
Sometimes I wonder if SHORT scripts really are the best thing to focus on.
The size of a script has little to do with how efficiently the query will execute. A more compact statement will likely consume fewer resources in terms of compilation, but (re)compilation is usually a rare occurrence in a live system.
Fewer table accesses is usually desirable, though, and this does lead to more compact code.
Very generally speaking, a smaller execution plan and a lower estimated cost will both tend to yield better results. Again, though, it's highly situational, and cost estimates in particular can be way off in some cases. It's important to measure the actual execution time, because at the end of the day, that's what matters.
With left joins I can achieve what I want with just a few lines. But then I tried a longer script, using unions. Which is the best method?
First of all, we need to know how much data will be in these tables in a real system. Right now there's so little it will be difficult to use the STATISTICS TIME performance metrics to figure out a winner -- the results that come back will be dominated by factors other than the query execution. With more data, it's likely the plans will change, thus rendering the comparison here moot.
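For reference, those metrics come from the standard session options, which would be wrapped around each candidate statement along these lines:

```sql
SET STATISTICS TIME ON;   -- per-statement CPU and elapsed time
SET STATISTICS IO ON;     -- per-table logical and physical reads

-- ... run one of the candidate queries here ...

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```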
Having said that, by looking at the query plans as they are now from a logical point of view, the first one is the winner.
You can see that the Clustered Index Scan of quantities appears once in the first plan, while it appears four times in the second one. The second plan also contains an expensive Distinct Sort as a result of using UNIONs (this operator could be eliminated by using UNION ALL instead, which won't change the results).
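To illustrate the UNION ALL point, here is a minimal sketch with invented columns and predicates (the real branches would come from the second query):

```sql
-- Sketch only: the column list and predicates are invented to show the shape of
-- the change; the real branches come from the original query.
SELECT q.ProductId, q.Quantity
FROM   dbo.quantities AS q
WHERE  q.ColorId IS NOT NULL

UNION ALL   -- ALL skips the duplicate-removing Distinct Sort that plain UNION requires

SELECT q.ProductId, q.Quantity
FROM   dbo.quantities AS q
WHERE  q.ColorId IS NULL;
```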
The first query could also probably be improved by getting index seeks on the colors and sizes tables, instead of table scans. It might be worth trying a hash match plan as well (which is what you'll probably see when quantities and products are larger), but for tables this small, the startup cost may be too much overhead to be of benefit.
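As a rough sketch of what that could look like (the index names and key columns are guesses, since the exact query and schema aren't shown here):

```sql
-- Guessed definitions: the right key and included columns depend on the actual
-- query and schema, so treat these as a starting point only.
CREATE NONCLUSTERED INDEX IX_colors_ColorId ON dbo.colors (ColorId);
CREATE NONCLUSTERED INDEX IX_sizes_SizeId   ON dbo.sizes  (SizeId);

-- A hash match plan can be tested explicitly with a query-level hint, e.g.:
-- SELECT ... FROM ... JOIN ... OPTION (HASH JOIN);
```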
What I would suggest you do is run each of the statements you want to test 10,000+ times in a loop, figure out the average execution time, and then compare.
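A minimal sketch of such a harness, assuming the statement under test is wrapped in something callable (the procedure name is a placeholder):

```sql
-- Timing-loop sketch; dbo.QueryUnderTest is a placeholder for whichever statement
-- is being measured.
DECLARE @i int = 0,
        @iterations int = 10000,
        @start datetime2 = SYSUTCDATETIME();

WHILE @i < @iterations
BEGIN
    EXEC dbo.QueryUnderTest;
    SET @i += 1;
END;

SELECT DATEDIFF(MILLISECOND, @start, SYSUTCDATETIME()) * 1.0 / @iterations
       AS avg_ms_per_execution;
```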
Best Answer
The difference in logical reads leads me to believe that the two executions are getting different plans. This could be because of how the procedure is being invoked in each case (exec dbo.SM_SP...).

You can see if there are two or more different plans for the procedure using something like this, which will also identify whether this is caused by either (or both) of the two most likely factors for different plan choice:
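The query referred to isn't reproduced in this excerpt, but a sketch of the kind of plan-cache check meant could look like this (the procedure name is a placeholder, since the real name is truncated above):

```sql
-- Hedged sketch, not necessarily the query the answer originally showed: list the
-- cached plans for the procedure together with the SET options each was compiled under.
SELECT
    qs.plan_handle,
    qs.execution_count,
    qs.total_logical_reads,
    pa.value AS set_options            -- differing values here mean differing SET options
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle)         AS st
CROSS APPLY sys.dm_exec_plan_attributes(qs.plan_handle) AS pa
WHERE st.objectid  = OBJECT_ID(N'dbo.YourProcedureName')  -- real name is truncated above
  AND pa.attribute = 'set_options';
```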