Sql-server – Plans created several times under Forced Parameterization

parameterplan-cachesql server

People use forced parameterization to save plan compilation cost. However, I found things different here.

The query I will run is as following, and under AdventureWorks2012 database:

DECLARE @Var VarChar(250), @SQL nvarChar(MAX)
SET @Var = NEWID()

SET @SQL =
'SELECT SalesOrderHeader.SalesPersonID,
       COUNT(DISTINCT SalesOrderHeader.CustomerID),
       SUM(SalesOrderDetail.OrderQty)
  FROM Sales.SalesOrderHeader
 INNER JOIN Sales.SalesOrderDetail
    ON SalesOrderDetail.SalesOrderID = SalesOrderHeader.SalesOrderID
 INNER JOIN Production.Product
    ON Product.ProductID = SalesOrderDetail.ProductID
 WHERE Product.Name = ' + '''' + @Var + '''
 GROUP BY SalesOrderHeader.SalesPersonID'

EXEC (@SQL)

I configured the database to use forced parameterization, flushed the cache using dbcc freeproccache, and then ran the above query 3 times. I used following query to get the plan use stats:

select  bucketid, usecounts, memory_object_address, cacheobjtype, objtype, plan_handle, text
from (select * from sys.dm_exec_cached_plans ) as cacheplan 
cross apply sys.dm_exec_sql_text(plan_handle)as text
WHERE text.text LIKE'%SalesOrderHeader%.%SalesPersonID%'

And below is the result I got:

enter image description here

Except row 1 are all about the queries I ran. Row 5 I can understand. This means database engine actually parameterized product.name and generate a plan for later use. That's why the usecounts = 3.

What I don't understand is why we still have row 2,3,4? Does that mean forced parameterization couldn't save anytime (or even cost more) when running the same type of queries?

I would really appreciate any suggestions or ideas. Thanks.

Additional Info:

  • I understand we can turn on "Optimized Ad hoc Workloads" option for the instance to save some space. However, I feel these ad hoc plan stubs are not supposed to be cached at all.

  • I know using sp_executesql would make solve above problem. But there are some queries in our production database that cannot be easily changed.

Best Answer

Okay, this is the reason:

From Caching Mechanisms on MSDN:

You should notice that the two individual queries with their distinct constants do get cached as adhoc queries. However, these are only considered shell queries and are only cached to make it easier to find the autoparameterized version of the query if the exact same query with the same constant is reused at a later time. These shell queries do not contain the full execution plan but only a pointer to the full plan in the corresponding prepared plan.

For more in-depth technical information, see:

4.0 Query Parameterization on the SQL Programmability & API Development Team Blog

There are benefits in caching the shell query: If the same query were to be re-executed, then we would compute the hash value of the sql text of the query and find an exact match in the cache i.e. the shell query. Since this shell query points to the compiled plan, the compiled plan is executed and we are done.

If we had not cached this shell query and if the same query was re-executed then the steps followed would be slightly different: first we would compute the hash of the sql text of the query and not find an exact match in the cache. Next, the query is auto-parameterized. Now for this auto-parameterized query we will search the cache and find an exact match in the cache avoiding the need to go to the query optimizer. Finally we execute this compiled plan and are done.

Clearly there are performance gains from caching the shell query, especially for applications that re-execute the same query with the same literal values as well. Note that we do not cache insert shell queries because the probability of re-using the exact same adhoc query is low.

For more information read the whole documents (and series of posts, in the second case).