SQL Server – Should Query Plans Be Split by Statement for Reusability?

execution-planplan-cachesql server

From my limited knowledge of how query plans are compiled, stored and retrieved by queries I understand that a multi statement query or stored procedure will generate it's query plan which will be stored in the query plan cache to be used by the query in future executions.

I think this plan is retrieved from the query plan cache with the query hash, which means if the query is edited and executed the hash is different and a new plan is generated as no matching hash can be found in the query plan cache.

My question is: If a user executes a statement that is one of the statements in the multi-statement query can it use that relevant part of the query plan already in the cache for the multi-statement query? I expect the answer is no because the hash values will obviously not match, but would it be better to hash each statement in a multi-statement query so they could be used by users running individual statements from the query?

I expect there are complications that I'm not taking into account (And it's these that I really want to know about) but it seems like we could be storing the same 'statement plan' in many query plans taking up more space and taking more CPU and time to generate.

Could just be showing my ignorance though.

Best Answer

If a user executes a statement that is one of the statements in the multi-statement query can it use that relevant part of the query plan already in the cache for the multi-statement query?

No. The basic unit of plan reuse in SQL Server is the batch.

Would it be better to hash each statement in a multi-statement query so they could be used by users running individual statements from the query?

A system tuned for high levels of plan reuse will place common code (at a suitable granularity) in reusable objects (e.g. procedures, functions, triggers) on the SQL Server. It will also explicitly parameterize any application-generated or client-side code. For maximum plan reuse, these generated batches should differ only in parameter values.

I expect there are complications that I'm not taking into account

It sounds like you are asking why SQL Server was designed to cache and reuse at the batch level, rather than at the statement level. I doubt anyone aside from the original designers could answer this question authoritatively. Anyway, it seems to me that a batch is a natural granularity to use because it is relatively self-contained natural unit of work, and represents a reasonable trade-off between implementation complexity and plan reuse probability.

There are a few things that make batches not entirely self-contained (for example local temporary tables created and referenced across stored procedure boundaries). These exceptions reduce orthogonality and have been associated with unexpected 'by design' behaviours and bugs over the years.

Related Solutions

SQL Server – Query Plan Cache Bloated by Ad-Hoc Queries

According to this, the second run of an ad hoc batch removes the stub (which was used only once) and creates and caches the plan (using it for the first time).

I also haven't seen many references to refcounts other than it being a count of references by cache objects. Adhoc Compiled Plan objects can still have a refcount of 1, so it's not exclusively caused by the persistence of the plan.

Execution Plan Caching for Stored Procedures vs Non-Dynamic Queries

_{The answer is also available as a standalone blog article.}

In order to find it out, I did some tests. The goal is to have the same parametrized query executed either directly from C# or by calling a stored procedure and to compare the runtime performance.

I started to create a stored procedure which does a sample query using Adventure Works database:

create procedure Demo
    @minPrice int 
as
begin
    set nocount on;

    select top 1 [p].[Name], [p].[ProductNumber], [ph].[ListPrice]
    from [Production].[Product] p
    inner join [Production].[ProductListPriceHistory] ph
    on [p].[ProductID] = ph.[ProductID]
    and ph.[StartDate] =
    (
        select top 1 [ph2].[StartDate]
        from [Production].[ProductListPriceHistory] ph2
        where [ph2].[ProductID] = [p].[ProductID]
        order by [ph2].[StartDate] desc
    )
    where [p].[ListPrice] > @minPrice
end

Then, I use the following piece of code to compare the performances:

long RunQuery(SqlConnection connection, int minPrice)
{
    const string Query = @"
    select top 1 [p].[Name], [p].[ProductNumber], [ph].[ListPrice]
    from [Production].[Product] p
    inner join [Production].[ProductListPriceHistory] ph
    on [p].[ProductID] = ph.[ProductID]
    and ph.[StartDate] =
    (
        select top 1 [ph2].[StartDate]
        from [Production].[ProductListPriceHistory] ph2
        where [ph2].[ProductID] = [p].[ProductID]
        order by [ph2].[StartDate] desc
    )
    where [p].[ListPrice] > @minPrice
    option (recompile)";

    using (var command = new SqlCommand(Query, connection))
    {
        command.Parameters.AddWithValue("@minPrice", minPrice);
        var stopwatch = Stopwatch.StartNew();
        command.ExecuteNonQuery();
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }
}

long RunStoredProcedure(SqlConnection connection, int minPrice)
{
    using (var command = new SqlCommand("exec Demo @minPrice with recompile", connection))
    {
        command.Parameters.AddWithValue("@minPrice", minPrice);
        var stopwatch = Stopwatch.StartNew();
        command.ExecuteNonQuery();
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }
}

ICollection<long> Execute(Func<SqlConnection, int, long> action)
{
    using (var connection = new SqlConnection("Server=.;Database=AdventureWorks2014;Trusted_Connection=True;"))
    {
        connection.Open();
        using (var command = new SqlCommand("DBCC FreeProcCache; DBCC DropCleanbuffers;", connection))
        {
            command.ExecuteNonQuery();
        }

        return Enumerable.Range(0, 100).Select(i => action(connection, i)).ToList();
    }
}

void Main()
{
    var queries = Execute(RunQuery);
    var storedProcedures = Execute(RunStoredProcedure);

    Console.WriteLine("Stored procedures: {0} ms. Details: {1}.", storedProcedures.Sum(), string.Join(", ", storedProcedures));
    Console.WriteLine("Queries: {0} ms. Details: {1}.", queries.Sum(), string.Join(", ", queries));
}

Notice option (recompile) and with recompile. This will force SQL Server to discard previously cached execution plans.

Each query is run une hundred times with a different parameter every time. The time spent by the server is measured at client side.

By running DBCC FreeProcCache; DBCC DropCleanbuffers; before gathering metrics, I make sure that all previously cached execution plans are removed.

Running this code gives the following output:

Stored procedures: 786 ms. Details: 12, 7, 7, 9, 7, 7, 9, 8, 8, 6, 8, 9, 8, 8, 14, 8, 7, 8, 7, 10, 10, 7, 9, 6, 9, 8, 8, 7, 7, 10, 8, 7, 7, 6, 7, 8, 8, 7, 7, 7, 14, 8, 8, 8, 7, 9, 8, 8, 7, 6, 6, 12, 7, 7, 8, 7, 8, 7, 8, 6, 7, 7, 7, 12, 8, 6, 6, 7, 8, 7, 8, 8, 7, 11, 8, 7, 8, 8, 7, 9, 8, 9, 10, 8, 7, 7, 8, 8, 7, 9, 7, 6, 9, 7, 6, 9, 8, 6, 6, 6.
Queries: 799 ms. Details: 21, 8, 8, 7, 6, 6, 11, 7, 6, 6, 9, 8, 8, 7, 9, 8, 7, 7, 7, 7, 7, 7, 10, 8, 8, 7, 8, 7, 6, 11, 19, 10, 8, 7, 8, 7, 7, 7, 6, 9, 7, 9, 7, 7, 8, 7, 12, 9, 7, 7, 7, 8, 7, 7, 8, 7, 7, 7, 9, 8, 7, 7, 7, 6, 7, 7, 16, 7, 7, 7, 8, 8, 9, 8, 7, 9, 8, 7, 8, 7, 7, 6, 7, 7, 7, 7, 12, 7, 9, 9, 7, 7, 7, 7, 9, 8, 7, 8, 11, 8.

Let's run it again:

Stored procedures: 763 ms. Details: 11, 8, 10, 8, 8, 14, 10, 6, 7, 7, 6, 7, 7, 9, 6, 6, 6, 8, 6, 6, 7, 6, 8, 7, 16, 8, 7, 8, 9, 7, 7, 8, 7, 7, 11, 10, 7, 6, 7, 8, 7, 7, 7, 7, 7, 7, 10, 9, 9, 7, 6, 7, 6, 7, 7, 6, 6, 6, 6, 6, 10, 9, 10, 7, 6, 6, 6, 6, 6, 8, 7, 6, 6, 7, 8, 9, 7, 8, 7, 10, 7, 7, 7, 6, 7, 6, 7, 11, 13, 8, 7, 10, 9, 8, 8, 7, 8, 7, 7, 7.
Queries: 752 ms. Details: 25, 10, 8, 8, 12, 8, 7, 9, 9, 8, 6, 7, 7, 6, 8, 6, 7, 7, 8, 9, 7, 7, 7, 7, 6, 10, 8, 7, 7, 7, 7, 7, 7, 7, 8, 9, 7, 6, 6, 6, 7, 13, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 6, 10, 7, 7, 8, 9, 8, 7, 6, 6, 7, 7, 9, 7, 8, 6, 9, 7, 7, 8, 7, 6, 6, 7, 7, 7, 7, 6, 7, 7, 8, 7, 7, 6, 7, 9, 8, 7, 7, 7, 7, 6, 7, 6, 6, 9, 7, 7.

It seems that the performance is very close between stored procedures and direct queries. Running the code a dozen times, I notice that stored procedures seem to be slightly fast, but the gap is very narrow. Possibly passing around the whole query creates this additional cost, which may increase if SQL Server is hosted on a dedicated machine with a slow LAN between it and the application server.

Let's now turn execution plan caching on and see what happens. To do this, I remove option (recompile) and with recompile from the code. Here's the new output:

Stored procedures: 26 ms. Details: 23, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.
Queries: 15 ms. Details: 14, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.

It becomes clear that caching has exactly the same effect for both direct queries and stored procedures. In both cases, it reduces the time to nearly zero milliseconds, and the most expensive query is the first one—the one which runs after the removal of cached execution plans.

Running the same code again shows a similar pattern. Sometimes, queries are faster, and sometimes stored procedures are. But every time, the first query is the most expensive one, and all others are close to zero milliseconds.

Reopening SQL connection

If the SQL connection is opened for every query, such as in this slightly modified code:

long RunQuery(string connectionString, int minPrice)
{
    const string Query = @"
    select top 1 [p].[Name], [p].[ProductNumber], [ph].[ListPrice]
    from [Production].[Product] p
    inner join [Production].[ProductListPriceHistory] ph
    on [p].[ProductID] = ph.[ProductID]
    and ph.[StartDate] =
    (
        select top 1 [ph2].[StartDate]
        from [Production].[ProductListPriceHistory] ph2
        where [ph2].[ProductID] = [p].[ProductID]
        order by [ph2].[StartDate] desc
    )
    where [p].[ListPrice] > @minPrice
    option (recompile)";

    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open();
        using (var command = new SqlCommand(Query, connection))
        {
            command.Parameters.AddWithValue("@minPrice", minPrice);
            var stopwatch = Stopwatch.StartNew();
            command.ExecuteNonQuery();
            stopwatch.Stop();
            return stopwatch.ElapsedMilliseconds;
        }
    }
}

long RunStoredProcedure(string connectionString, int minPrice)
{
    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open();
        using (var command = new SqlCommand("exec Demo @minPrice with recompile", connection))
        {
            command.Parameters.AddWithValue("@minPrice", minPrice);
            var stopwatch = Stopwatch.StartNew();
            command.ExecuteNonQuery();
            stopwatch.Stop();
            return stopwatch.ElapsedMilliseconds;
        }
    }
}

ICollection<long> Execute(Func<string, int, long> action)
{
    var connectionString = "Server=.;Database=AdventureWorks2014;Trusted_Connection=True;";
    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open();
        using (var command = new SqlCommand("DBCC FreeProcCache; DBCC DropCleanbuffers;", connection))
        {
            command.ExecuteNonQuery();
        }
    }

    return Enumerable.Range(0, 100).Select(i => action(connectionString, i)).ToList();
}

void Main()
{
    var queries = Execute(RunQuery);
    var storedProcedures = Execute(RunStoredProcedure);

    Console.WriteLine("Stored procedures: {0} ms. Details: {1}.", storedProcedures.Sum(), string.Join(", ", storedProcedures));
    Console.WriteLine("Queries: {0} ms. Details: {1}.", queries.Sum(), string.Join(", ", queries));
}

the observed metrics are very similar:

Stored procedures: 748 ms. Details: 11, 8, 6, 6, 8, 9, 9, 8, 8, 7, 6, 8, 7, 9, 6, 6, 6, 6, 6, 6, 7, 7, 6, 9, 6, 6, 7, 6, 6, 7, 8, 6, 7, 7, 7, 13, 7, 7, 8, 7, 8, 8, 7, 7, 7, 7, 6, 7, 8, 8, 8, 9, 7, 6, 8, 7, 6, 7, 6, 6, 6, 6, 8, 12, 7, 9, 9, 6, 7, 7, 7, 8, 10, 12, 8, 7, 6, 9, 8, 7, 6, 6, 7, 8, 6, 6, 12, 7, 8, 10, 10, 7, 8, 7, 8, 10, 8, 7, 8, 7.
Queries: 761 ms. Details: 31, 9, 7, 6, 6, 8, 7, 7, 7, 7, 7, 6, 8, 7, 6, 6, 7, 10, 8, 10, 9, 7, 7, 7, 7, 10, 13, 7, 10, 7, 6, 6, 6, 8, 7, 7, 7, 7, 7, 7, 7, 9, 7, 7, 7, 6, 6, 6, 9, 7, 7, 7, 7, 7, 6, 8, 10, 7, 7, 7, 7, 7, 7, 7, 8, 6, 10, 10, 7, 8, 8, 7, 7, 7, 7, 7, 6, 6, 7, 6, 8, 7, 7, 7, 7, 7, 7, 7, 8, 7, 8, 7, 9, 7, 6, 6, 12, 10, 7, 6.

with option (recompile) and with recompile and:

Stored procedures: 15 ms. Details: 14, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.
Queries: 32 ms. Details: 26, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0.

without.

Under the hood

Let's see what happens under the hood. The following query shows cached execution plans:

select usecounts, size_in_bytes, cacheobjtype, objtype, text 
from sys.dm_exec_cached_plans 
cross apply sys.dm_exec_sql_text(plan_handle)
where cacheobjtype = 'Compiled Plan'
order by usecounts desc

When running this query after executing the stored procedures one hundred times, the result of the query looks like this:

usecounts   size_in_bytes cacheobjtype                                       objtype              text
----------- ------------- -------------------------------------------------- -------------------- ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
100         90112         Compiled Plan                                      Proc                 create procedure Demo
    @minPrice int 
as
begin
    set nocount on;

    select top 1 [p].[Name], [p].[ProductNumber], [ph].[ListPrice]
    from [Production].[Product] p
    inner join [Production].[ProductListPriceHistory] ph
    on [p].[ProductID] = ph.[Product
100         16384         Compiled Plan                                      Prepared             (@minPrice int)exec Demo @minPrice --with recompile
1           49152         Compiled Plan                                      Adhoc                --DBCC FreeProcCache
--DBCC DropCleanbuffers

select usecounts, size_in_bytes, cacheobjtype, objtype, text 
from sys.dm_exec_cached_plans 
cross apply sys.dm_exec_sql_text(plan_handle)
where cacheobjtype = 'Compiled Plan'
order by usecounts desc

(3 row(s) affected)

When running the query directly one hundred times, the result is:

usecounts   size_in_bytes cacheobjtype                                       objtype              text
----------- ------------- -------------------------------------------------- -------------------- ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
100         73728         Compiled Plan                                      Prepared             (@minPrice int)
    select top 1 [p].[Name], [p].[ProductNumber], [ph].[ListPrice]
    from [Production].[Product] p
    inner join [Production].[ProductListPriceHistory] ph
    on [p].[ProductID] = ph.[ProductID]
    and ph.[StartDate] =
    (
        select top 1 [ph2].[
1           49152         Compiled Plan                                      Adhoc                --DBCC FreeProcCache
--DBCC DropCleanbuffers

select usecounts, size_in_bytes, cacheobjtype, objtype, text 
from sys.dm_exec_cached_plans 
cross apply sys.dm_exec_sql_text(plan_handle)
where cacheobjtype = 'Compiled Plan'
order by usecounts desc

(2 row(s) affected)

Conclusion

The execution plan is cached for stored procedures and direct queries.
The performance between stored procedures and direct queries is very similar when the SQL Server and the application are hosted on the same machine. When SQL Server is hosted on a dedicated server accessed through LAN, using stored procedures may result in better performance.