SQL Server – Should Query Plans Be Split by Statement for Reusability?

execution-planplan-cachesql server

From my limited knowledge of how query plans are compiled, stored and retrieved by queries I understand that a multi statement query or stored procedure will generate it's query plan which will be stored in the query plan cache to be used by the query in future executions.

I think this plan is retrieved from the query plan cache with the query hash, which means if the query is edited and executed the hash is different and a new plan is generated as no matching hash can be found in the query plan cache.

My question is: If a user executes a statement that is one of the statements in the multi-statement query can it use that relevant part of the query plan already in the cache for the multi-statement query? I expect the answer is no because the hash values will obviously not match, but would it be better to hash each statement in a multi-statement query so they could be used by users running individual statements from the query?

I expect there are complications that I'm not taking into account (And it's these that I really want to know about) but it seems like we could be storing the same 'statement plan' in many query plans taking up more space and taking more CPU and time to generate.

Could just be showing my ignorance though.

Best Answer

If a user executes a statement that is one of the statements in the multi-statement query can it use that relevant part of the query plan already in the cache for the multi-statement query?

No. The basic unit of plan reuse in SQL Server is the batch.

Would it be better to hash each statement in a multi-statement query so they could be used by users running individual statements from the query?

A system tuned for high levels of plan reuse will place common code (at a suitable granularity) in reusable objects (e.g. procedures, functions, triggers) on the SQL Server. It will also explicitly parameterize any application-generated or client-side code. For maximum plan reuse, these generated batches should differ only in parameter values.

I expect there are complications that I'm not taking into account

It sounds like you are asking why SQL Server was designed to cache and reuse at the batch level, rather than at the statement level. I doubt anyone aside from the original designers could answer this question authoritatively. Anyway, it seems to me that a batch is a natural granularity to use because it is relatively self-contained natural unit of work, and represents a reasonable trade-off between implementation complexity and plan reuse probability.

There are a few things that make batches not entirely self-contained (for example local temporary tables created and referenced across stored procedure boundaries). These exceptions reduce orthogonality and have been associated with unexpected 'by design' behaviours and bugs over the years.