This query runs in ~21 seconds (execution plan):
select
a.month
, count(*)
from SubqueryTest a
where a.year = (select max(b.year) from SubqueryTest b)
group by a.month
When the subquery is replaced with a variable, it runs in <1 second (execution plan):
declare @year float
select @year = max(b.year) from SubqueryTest b
select
month
, count(*)
from SubqueryTest
where year = @year
group by month
Judging from the execution plan, the "select max…" sub-select is run for each of the millions of rows in "SubqueryTest a", which is why it takes so long.
My question: since the sub-select is scalar, deterministic, and not correlated, why doesn't the query optimizer do what I did in my second example and run the subquery once, store the result, then use it for the main query? I'm sure there's just a hole in my understanding of SQL Server, but I'd really like help filling it; a couple of hours with Google haven't helped.
The table is just over 1 GB with almost 28 million records:
CREATE TABLE SubqueryTest(
[pk_id] [int] IDENTITY(1,1) NOT NULL
, [Year] [float] NULL
, [Month] [float] NULL
, PRIMARY KEY CLUSTERED ([pk_id] ASC))
CREATE NONCLUSTERED INDEX idxSubqueryTest ON SubqueryTest ([Year] ASC)
Best Answer
The slow plan isn't calculating the MAX for each row in the outer query. In fact, it never explicitly calculates it at all.
It gives a plan similar to this:
[plan image: Slow Plan (Estimated Row Counts)]
You have a non-covering index on year ASC, so it scans that backwards to get the rows for the maximum year (it shows as a seek because of the implicit IS NOT NULL predicate). Unfortunately it doesn't seem to differentiate between TOP 1 and TOP 1 WITH TIES when estimating row counts. In this case that makes a huge difference (an estimated 2 key lookups vs. an actual 4,424,803), so you get an inappropriate plan.
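Logically, the slow plan treats the equality against the scalar MAX as a row-goal pattern similar to TOP 1 WITH TIES over the year index. A sketch of that logical rewrite (not the literal plan shape; the derived-table alias m is made up here):

```sql
select
a.month
, count(*)
from SubqueryTest a
join (
    -- backwards scan of the ascending Year index: highest year first,
    -- WITH TIES keeps every row that shares that maximum year
    select top 1 with ties b.year
    from SubqueryTest b
    where b.year is not null
    order by b.year desc
) m on a.year = m.year
group by a.month
```

The estimate for the WITH TIES branch comes out close to a plain TOP 1 (about 2 rows), which is what makes the key-lookup plan look cheap even though millions of rows actually share the maximum year.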
[plan image: Slow Plan (Actual Row Counts)]
You could consider adding month into the index on year, either as a key or an included column, to make the index covering. The benefit of adding it as a secondary key column would be that it could then feed into a stream aggregate without an additional sort (though for only 12 distinct values a hash aggregate would be fine anyway).

A non-covering index on such a non-selective column is really pretty useless for the vast majority of queries. The index is totally ignored by the "fast" plan, which ends up doing a parallel scan on the whole table and evaluating the predicate against all 27,445,400 rows (in preference to performing the huge number of lookups).
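For example, either of the following would make the index covering for this query (the index names here are placeholders):

```sql
-- Month as a second key column
CREATE NONCLUSTERED INDEX idxSubqueryTest_YearMonth
ON SubqueryTest ([Year] ASC, [Month] ASC);

-- or Month as an included column
CREATE NONCLUSTERED INDEX idxSubqueryTest_YearInclMonth
ON SubqueryTest ([Year] ASC) INCLUDE ([Month]);
```

With Month as a second key column, rows within each year arrive already ordered, so the aggregate can be a stream aggregate with no extra sort; with INCLUDE the index keys stay narrower, but the aggregate would need a sort or a hash.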