SQL Server – Behavior Differences Between SQL 2017 and SQL 2016 in 130 Compat Mode

Tags: sql-server-2016, sql-server-2017, upgrade

We have a new application which includes ETL scripts, R, and .NET code all actively being developed on SQL 2016 architecture.

I just recently received approval to start setting up new environments with SQL 2017.

I would like to understand whether any code changes are needed when migrating to a SQL 2017 installation. Alternatively, would a database running on SQL 2017 at the SQL 2016 (130) compatibility level behave the same as it would on an actual SQL 2016 installation at compatibility level 130?

Reading through this link, I found the following:

To upgrade the SQL Server Database Engine to the latest version, while maintaining the database compatibility level that existed before the upgrade and its supportability status, it is recommended to perform static functional surface area validation of the application code in the database, by using the Microsoft Data Migration Assistant tool (DMA). The absence of errors in the DMA tool output, about missing or incompatible functionality, protects application from any functional regressions on the new target version.

Assuming this check passes, is there anything else that I should be doing or looking at?

Best Answer

It will not behave exactly the same, though it will mostly work the same. Compatibility level is set per database, not per instance, and the instance itself is still running the SQL Server 2017 engine.

I would check the list of breaking changes for 2017, as some instance-level changes may still affect your code even though the database is at the 2016 compatibility level. However, the breaking changes in 2017 are relatively minor, so you are unlikely to be affected. In general, some breaking changes between versions are covered by the compatibility level and some are not.

The documentation gives good examples of this:
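As a minimal sketch of that split between instance version and database compatibility level, the following checks both and pins a database at 130 on a 2017 instance. The database name MyAppDb is a placeholder, not something taken from the question.

```sql
-- Engine version of the instance (14.x = SQL Server 2017, 13.x = SQL Server 2016).
SELECT SERVERPROPERTY('ProductVersion') AS instance_version;

-- Compatibility level of every database on the instance.
SELECT name, compatibility_level
FROM   sys.databases;

-- Keep a database at the SQL Server 2016 level (MyAppDb is a hypothetical name).
ALTER DATABASE [MyAppDb] SET COMPATIBILITY_LEVEL = 130;
```

The first query reports the engine, which no database setting changes; only the second value is under your control per database.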

For example, the FASTFIRSTROW hint was discontinued in SQL Server 2012 (11.x) and replaced with the OPTION (FAST n ) hint. Setting the database compatibility level to 110 will not restore the discontinued hint.

and

An example of a breaking change protected by compatibility level is an implicit conversion from datetime to datetime2 data types. Under database compatibility level 130, these show improved accuracy by accounting for the fractional milliseconds, resulting in different converted values. To restore previous conversion behavior, set the database compatibility level to 120 or lower.
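A small sketch for observing the datetime to datetime2 behaviour described in the quote. The literal value is an arbitrary example; run the batch once with the database at compatibility level 120 and once at 130 and compare the fractional seconds of the converted value.

```sql
-- datetime stores time in 1/300-second ticks, so .003 is really one third of 1/100 s.
DECLARE @dt  datetime     = '2016-01-01T12:34:56.003';

-- Implicit conversion from datetime to datetime2 on assignment.
DECLARE @dt2 datetime2(7) = @dt;

SELECT @dt  AS original_datetime,
       @dt2 AS converted_datetime2;
```

If the converted value differs between the two levels, that is the change protected by the compatibility level; code that compares datetime and datetime2 values for equality is the most likely place to notice it.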

You should also look at the deprecated feature list for 2017 and keep those features out of future development to future-proof your applications. And of course, any new features in 2017 may require code updates to take advantage of them, but I get the impression you're more concerned about breaking changes.

With all that said, you're probably fine, but you should still carefully and thoroughly test upgrading the compatibility mode before moving it to production.

Related Solutions

SQL Server 2016 – Changes to Estimates on SUBSTRING() Predicates

I'm not aware of any documentation. However, I did look into this and made some observations that are too long for a comment.

The 10% estimate is not always a degradation. Take the following example.

TRUNCATE TABLE dbo.StringTest;

INSERT INTO dbo.StringTest
SELECT TOP (1000000) 'ZZ_' + LEFT(NEWID(), 12)
FROM   master..spt_values v1,
       master..spt_values v2;

and the WHERE clause in your question.

WHERE SUBSTRING(TheString, 1, CHARINDEX('_',TheString) - 1) = 'ZZ'

The table contains a million rows. All of them match the predicate. Under compat level 130 the 10% guess yields an estimate of 100,000. Under 120 the estimated rows is 1.03913.
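One way to reproduce the two estimates side by side, as a sketch: it assumes the dbo.StringTest table from the question exists in the current database and that you are running in a client such as SSMS or sqlcmd that understands the GO batch separator.

```sql
-- Estimate under the 120 behaviour.
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 120;
GO

SELECT *
FROM   dbo.StringTest
WHERE  SUBSTRING(TheString, 1, CHARINDEX('_', TheString) - 1) = 'ZZ'
OPTION (RECOMPILE);  -- force a fresh compile so no cached plan is reused
GO

-- Same query under the 130 behaviour.
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 130;
GO

SELECT *
FROM   dbo.StringTest
WHERE  SUBSTRING(TheString, 1, CHARINDEX('_', TheString) - 1) = 'ZZ'
OPTION (RECOMPILE);
GO
```

Compare the estimated number of rows on the filter or scan operator in the two execution plans.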

The 120 behaviour uses the histogram but only to get the number of distinct rows. The density vector in my case shows 1.039131E-06 and this is multiplied by the table cardinality to get the estimated row count. All of the values are in fact different but all match the predicate.
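To check the density figure yourself, something along these lines should work. The statistics name in the DBCC call is a placeholder; take the real one from the first query. The last statement simply repeats the arithmetic above: 1.039131E-06 multiplied by 1,000,000 rows gives roughly 1.039.

```sql
-- Find the statistics objects that exist on the table.
SELECT s.name, s.auto_created
FROM   sys.stats AS s
WHERE  s.object_id = OBJECT_ID('dbo.StringTest');

-- Show only the density vector (replace the hypothetical statistics name).
DBCC SHOW_STATISTICS ('dbo.StringTest', 'stats_name_from_query_above')
    WITH DENSITY_VECTOR;

-- Density multiplied by table cardinality, using the figures quoted above.
SELECT 1.039131E-06 * 1000000 AS estimated_rows;
```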

Tracing the query_optimizer_estimate_cardinality extended event shows that under 130 there are two different <StatsCollection Name="CStCollFilter" events. The first one estimates 100,000. The second one loads the histogram and uses the CSelCalcPointPredsFreqBased/DistinctCountCalculator to get the 1.04 estimate. This second result appears unused.
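A sketch of an extended events session for capturing that event. The session name and the session_id filter value are made up for illustration; substitute the SPID of the connection that runs the test query.

```sql
-- Capture cardinality estimation events for one connection only.
CREATE EVENT SESSION [CardinalityEstimates] ON SERVER
ADD EVENT sqlserver.query_optimizer_estimate_cardinality
    (WHERE (sqlserver.session_id = 55))  -- hypothetical SPID
ADD TARGET package0.ring_buffer;

ALTER EVENT SESSION [CardinalityEstimates] ON SERVER STATE = START;

-- ... run the query under test from the filtered connection ...

-- Read the captured events while the session is still running.
SELECT CAST(t.target_data AS xml) AS captured_events
FROM   sys.dm_xe_sessions AS s
JOIN   sys.dm_xe_session_targets AS t
       ON t.event_session_address = s.address
WHERE  s.name = N'CardinalityEstimates';

-- Clean up.
ALTER EVENT SESSION [CardinalityEstimates] ON SERVER STATE = STOP;
DROP EVENT SESSION [CardinalityEstimates] ON SERVER;
```

This event is a debug-channel event and can be expensive, so it is best kept to a test instance.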

The behavior that you observed is not consistently applied in 130. I added ORDER BY TheString, expecting this to be a clear win for the 130 estimator, as the 120 estimator struggles on with a memory grant sized for one row. But this minor change was enough to bring the estimated rows down to 1.03913 in the 130 case too.

Adding OPTION (QUERYRULEOFF SelectToFilter) reverts the estimate going into the sort to 100,000, but the memory grant doesn't increase and the estimates coming out of the sort are still based on the table's distinct values.

Similarly, tweaking the cost threshold for parallelism so that the query gets a parallel plan was sufficient in the 130 case to revert to the lower estimate. Adding QUERYTRACEON 8757 also causes the lower estimate. It looks like the 10% estimate is only retained for trivial plans.

Your proposed rewrite with

WHERE TheString LIKE 'ZZ[_]%'

shows much better estimates than either. The output for this is

  CSelCalcTrieBased

      Column: QCOL: [MyStringTestDB].[dbo].[StringTest].TheString

This shows that it used tries. More info about this is in the string summary statistics section just above here.

It is not the same as your original query, however, as the first instance of _ is now assumed to always be the third character rather than being found dynamically.

If this assumption is hardcoded into your original query

 WHERE SUBSTRING(TheString, 1, 3) = 'ZZ_'

the estimation method changes to CSelCalcHistogramComparison(INTERVAL) and the estimated rows become accurate.

It is able to convert that into a range

WHERE TheString >=  'ZZ_' AND TheString < ???

and use the histogram to estimate the number of rows with values in that range.

This applies only to the cardinality estimation however. LIKE is preferable as it can use a range seek at runtime. SUBSTRING(TheString, 1, 3) or LEFT(TheString, 3) can't.

SQL Server Upgrade 2008 to 2016 – Compatibility Slow Queries

Microsoft has an upgrade strategy for changing the compatibility mode on SQL Server 2016. Quoting the linked article:

The recommended workflow for upgrading the query processor to the latest version of the code is:

1. Upgrade a database to SQL Server 2016 without changing the database compatibility level (keep it at prior level).
2. Enable the query store on the database. For more information about enabling and using the query store, see Monitoring Performance By Using the Query Store.
3. Wait sufficient time to collect representative data of the workload.
4. Change the compatibility level of the database to 130.
5. Using SQL Server Management Studio, evaluate if there are performance regressions on specific queries after the compatibility level change.
6. For cases where there are regressions, force the prior plan in the query store.
7. If there are query plans that fail to force or if performance is still insufficient, consider reverting the compatibility level to the prior setting and then engaging Microsoft Customer Support.

You could try a version of that for your situation. Change the compatibility mode back to 100, enable the query store, go through a full business cycle and get a good baseline, then change compatibility mode and use the query store to analyze poorly running queries and take further action on them.
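A rough T-SQL sketch of that adapted workflow. The database name MyAppDb and the query_id and plan_id values are placeholders; the real ids come from the Query Store views or the SSMS Query Store reports.

```sql
-- 1. Go back to the pre-upgrade level (100 = SQL Server 2008).
ALTER DATABASE [MyAppDb] SET COMPATIBILITY_LEVEL = 100;

-- 2. Turn on Query Store and let it collect a full business cycle.
ALTER DATABASE [MyAppDb] SET QUERY_STORE = ON;
ALTER DATABASE [MyAppDb] SET QUERY_STORE (OPERATION_MODE = READ_WRITE);

-- 3. Once the baseline is in place, move to the new level.
ALTER DATABASE [MyAppDb] SET COMPATIBILITY_LEVEL = 130;
GO

-- 4. Look for slow queries and the plans recorded for them.
USE [MyAppDb];
GO

SELECT TOP (20)
       q.query_id, p.plan_id, rs.avg_duration, qt.query_sql_text
FROM   sys.query_store_query AS q
JOIN   sys.query_store_query_text AS qt ON qt.query_text_id = q.query_text_id
JOIN   sys.query_store_plan AS p ON p.query_id = q.query_id
JOIN   sys.query_store_runtime_stats AS rs ON rs.plan_id = p.plan_id
ORDER  BY rs.avg_duration DESC;

-- 5. Force the plan that performed well before the change (hypothetical ids).
EXEC sys.sp_query_store_force_plan @query_id = 42, @plan_id = 7;
```

The regressed-queries report in SSMS gives the same information with a before and after comparison, which is usually easier than reading the views directly.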
