SQL Server Execution Plan – TOP Operator Without TOP or ORDER BY

execution-plansql serversql-server-2012top

enter image description hereI have product sales by month temporary table.(subset of much larger historical table) I'm running query to find new product not previously sold in the previous months for a specified 6 month range. So take January as base, in Feb show me all new product sold that wasn't in Jan sales. In March all new product not sold in Jan or Feb sales, etc. For first 5 months query returns sub second. On 6th month, it takes ~5 minutes. Comparing execution plans, on that 6th month, it's showing Top operator in my execution plan that's taking the extra time. But I'm not sorting or using Top X in select. What in my query is causing this? I can wait 5 mins since it's one-off request, but I'm more asking from a learning perspective, I want to understand the why it added to Top operator. Thanks in advance.

select divn_nbr, dept_nbr, vendorNumber, pid, nrfcolorNumber, periodcode    
from ##tmpSFA a 
where periodCode = '201512' 
and not exists  
(   
    select 1 from ##tmpSFA b
    where ISNULL(a.divn_nbr, -99) = isnull(b.divn_nbr, -99)
    and isnull(a.dept_nbr, -99) = isnull(b.dept_nbr, -99)
    and isnull(a.vendorNumber, -99) = isnull(b.vendorNumber, -99)
    and isnull(LTRIM(rtrim(upper(a.pid))), -99) = isnull(LTRIM(rtrim(upper(b.pid))), -99)
    and isnull(a.nrfcolorNumber, -99) = isnull(b.nrfColorNumber, -99)
    and b.periodCode IN ('201507', '201508', '201509', '201510', '201511')
)   

Here are the results and plan:

enter image description here
enter image description here

Best Answer

NOT EXISTS can introduce a TOP 1 (as can EXCEPT in some cases). This is because it only cares about the fact that at least one row exists. This is called short circuiting, and you want this - it is much more likely to speed up your query than slow it down. And I'd be really surprised if the TOP didn't show up in the plan for different date ranges, unless there was a substantial difference in row counts and a different plan was compiled. Maybe you could post the two .sqlplan files somewhere so that we could associate each plan with the query and help determine the reasons for any differences. With a screen shot and the query that allegedly goes with it, there's not a ton to go on.

I'm not convinced that the TOP in your case is making this query run for 5 minutes. Rather than just look at the plan, I'd actually investigate what the query is doing for 5 minutes, while it is running, by looking at sys.dm_exec_requests - is it blocking? What is the wait type? Since you're using a ##global temp table (why?), this could be due to blocking, but it's unclear from just a screen shot of a plan. Why doesn't your ##temp table have a clustered index? What other indexes does it have? The TOP would probably be more efficient if it could sort on something sensible other than the output of a full table scan. You could consider changing the subquery to be TOP (1) ... ORDER BY but adding a clustered index (or an index that covers and satisfies all the where clauses in the subquery) should accomplish something quite similar.

For more info on NOT EXISTS and alternatives, see this blog post.