SQL Server – SQL Plan Management for Baseline

execution-planmigrationperformancesql serverupgrade

I am a .Net developer. I have a database upgrade coming up (from SQL Server 2005 to SQL Server 2012). The database team will be handling the migration. But I have seen forums posts saying that after migration, the queries are taking too long to execute even though amount of data and number of users are almost same. And one main reason is change in execution plan between these servers. Reference: Performance problem after migration from SQL Server 2005 to 2012

So, I am considering possibility of storing the actual execution plans from my current production server itself. In Oracle, I could find a feature for baselining:-

In SQL Server, what is the usual practice for achieving this?

Note: I referred Creating a baseline for SQL Server, yet it doesn't say anything about baselining execution plan

Best Answer

You would just pull the execution plans for the most relevant queries from the plan cache. I would query sys.dm_exec_query_stats, and join to sys.dm_exec_query_plan and sys.dm_exec_sql_text to get the plan and query text, respectively. Within sys.dm_exec_query_stats you can pull queries with the highest IO, CPU, execution count, etc. - whatever is most important to you.

Looking to the future, SQL Server 2016 introduces the Query Store:

Monitoring Performance By Using the Query Store

Related Solutions

Sql-server – Query tuning – SQL Server

To flesh out what Aaron states a little more, linked server performance, particularly for large result sets, for cross-server joins and cross-server subqueries, is often disappointing. If you watch the remote server with Profiler, you may find that the local server bombards the remote server with requests for the fetch of a single row to match join columns. When that happens, the network latency and calling overhead conspire to kill performance.

If you can query local data without too much trouble, that would be best. You might be able to restore a production backup or use SSIS (or even bcp, it still works) to copy data from the production server to some working tables on the local server. Generally SSIS, bcp and similar tactics are faster than linked servers and may help to avoid issues with log file growth.

If you must query the data from a remote server, you may find that rewriting the query so that it uses OPENQUERY() (rather than four-part names) and 'sends' all of the 'remote parts' of the query over to the remote server and then joins the results of that to the local data will be more effective. SQL is supposed to be smart enough to move all of the joins to the remote server, but sometimes it doesn't and OPENQUERY() gives you a method to force SQL to do want you want.

Another, similar, tactic would be to run the 'remote part' of the query first and put the results into a temporary table, (optionally) index the temporary table and then join the 'local part' of the query to the temporary table. Again, this helps you to force SQL to do what it ought to.

It sounds like more work, but SQL may be able to behave more efficiently. As always, watch your data types on the joins and your SARGs or your indexes will be ignored.

Sql-server – Why are there execution plan differences between OFFSET … FETCH and the old-style ROW_NUMBER scheme

The examples in the question do not quite produce the same results (the OFFSET example has an off-by-one error). The updated forms below fix that issue, remove the extra sort for the ROW_NUMBER case, and use variables to make the solution more general:

DECLARE 
    @PageSize bigint = 10,
    @PageNumber integer = 3;

WITH Numbered AS
(
    SELECT TOP ((@PageNumber + 1) * @PageSize) 
        o.*,
        rn = ROW_NUMBER() OVER (
            ORDER BY o.[object_id])
    FROM #objects AS o
    ORDER BY 
        o.[object_id]
)
SELECT
    x.name,
    x.[object_id],
    x.principal_id,
    x.[schema_id],
    x.parent_object_id,
    x.[type],
    x.type_desc,
    x.create_date,
    x.modify_date,
    x.is_ms_shipped,
    x.is_published,
    x.is_schema_published
FROM Numbered AS x
WHERE
    x.rn >= @PageNumber * @PageSize
    AND x.rn < ((@PageNumber + 1) * @PageSize)
ORDER BY
    x.[object_id];

SELECT
    o.name,
    o.[object_id],
    o.principal_id,
    o.[schema_id],
    o.parent_object_id,
    o.[type],
    o.type_desc,
    o.create_date,
    o.modify_date,
    o.is_ms_shipped,
    o.is_published,
    o.is_schema_published
FROM #objects AS o
ORDER BY 
    o.[object_id]
    OFFSET @PageNumber * @PageSize - 1 ROWS 
    FETCH NEXT @PageSize ROWS ONLY;

The ROW_NUMBER plan has an estimated cost of 0.0197935:

Row Number Plan

The OFFSET plan has an estimated cost of 0.0196955:

Offset Plan

That is a saving of 0.000098 estimated cost units (though the OFFSET plan would require extra operators if you want to return a row number for each row). The OFFSET plan will still be slightly cheaper, generally speaking, but do remember that estimated costs are exactly that - real testing is still required. The bulk of the cost in both plans is the cost of the full sort of the input set, so helpful indexes would benefit both solutions.

Where constant literal values are used (e.g. OFFSET 30 in the original example) the optimizer can use a TopN Sort instead of a full sort followed by a Top. When the rows needed from the TopN Sort is a constant literal and <= 100 (the sum of OFFSET and FETCH) the execution engine can use a different sort algorithm which can perform faster than generalized TopN sort. All three cases have different performance characteristics overall.

As to why the optimizer does not automatically transform the ROW_NUMBER syntax pattern to use OFFSET, there are a number of reasons:

It's almost impossible to write a transform that would match all existing uses
Having some paging queries automatically transformed and not others could be confusing
The OFFSET plan is not guaranteed to be better in all cases

One example for the third point above occurs where the paging set is quite wide. It can be much more efficient to seek the keys needed using a nonclustered index and manually lookup against the clustered index compared with scanning the index with OFFSET or ROW_NUMBER. There are additional issues to consider if the paging application needs to know how many rows or pages there are in total. There is another good discussion of the relative merits of the 'key seek' and 'offset' methods here.

Overall, it is probably better that people make an informed decision to change their paging queries to use OFFSET, if appropriate, after thorough testing.

Best Answer

Related Solutions

Sql-server – Query tuning – SQL Server

Sql-server – Why are there execution plan differences between OFFSET … FETCH and the old-style ROW_NUMBER scheme

Related Question