Sql-server – SQL Server 2016 not using index used in 2012


2016 plan here

2012 plan here

I have almost identical tables with a very small difference in row count, one on 2012 and the other on 2016. Indexing is identical. These VMs are in the exact same environment with upgraded OS and SQL Server versions. Same number of vcores, same memory, same server settings (max degree of parallelism = 8, cost threshold for parallelism = 30).

This is a drop-dead simple query for a single record, using a single column for the filter and returning a single column. The column in the WHERE filter is the sole column in the index.

2016 version has 8254356 rows
2012 version has 8254427 rows

They are the same queries. 2016 is ignoring the index and doing a full table scan for no apparent reason. 2012 does an index scan followed by a RID lookup (Heap) on the table.

I tried WITH (index = CONTACT_RC_NUI1) on 2016 server and the cost jumped from 991 to 1889. On 2012 the cost was 29.

I tried adding AND 1 = (SELECT 1) and that made no difference.
I tried removing parameter sniffing as a possible problem by using OPTION (RECOMPILE) and that made no difference.

The DBA ran index rebuilds after restoring the database. Both servers had fairly recent index stats updates (we run Ola's index update script). And to be sure I rebuilt the index on 2016 which had no effect on the 2016 explain plan.

I added the hint below to the query:

select address1_stateorprovince 
from dla.dcrm.CONTACT_RC WITH (index = CONTACT_RC_NUI1)
where wv_partyid = 343083;

This took the cost from 991 to 1889, even though the plan looks almost identical to the 2012 explain plan (2016 just added parallelism (gather streams)).

What 2016 appears to be doing is costing the index for only 1% but the RID lookup is 99%. In 2012 this was reversed. It appears 2016 used the index to scan all the entries and then looked up every RID in the table? Could that be true? I think 2016 optimizer has been smoking something seriously strong.

wv_partyid is nvarchar(100)

Best Answer

You see different plans because SQL Server 2014 introduced a new cardinality estimator (CE), and SQL Server 2016 then added some new features to that new CE.

First some test data to reproduce what you see.

create table dbo.T(C1 char(10) default '', C2 varchar(11));

go

insert into dbo.T(C2)
select top(800000) row_number() over(order by (select null))
from sys.columns as c1, sys.columns as c2, sys.columns as c3

go

create index IX_T_C2 on dbo.T(C2)

And the queries that will produce the two different plans for you so you can compare them in the same version of SQL Server.

-- Table scan version
select C1
from dbo.T with (index = 0)
where C2 = 100000
option (maxdop 1);

-- Index Scan version
select C1
from dbo.T with (index = IX_T_C2)
where C2 = 100000
option (maxdop 1);

The table scan version in SQL Server 2012 scans all rows and returns one. No surprise there.

The Index Scan version in SQL Server 2012 scans all rows in the index and returns one row. There is something there that needs a closer look, but for now pay attention to the Estimated Number of Rows for the Index Scan operator.

The Table Scan version of SQL Server 2016 is no different than the version in 2012. It scans all rows returning one row.

The Index Scan version looks the same as in 2012, but the cost is much higher because the Estimated Number of Rows is much higher than in 2012.

So SQL Server now thinks it has to do 80,000 RID Lookups to return 1 row, and that is why it chooses the Table Scan in SQL Server 2016 with the new cardinality estimator.

The new estimator sees the predicate where CONVERT_IMPLICIT(int,C2) = 100000 and gives up. It uses the standard guess of 10% selectivity for an equality predicate, where 10% of 800,000 rows = 80,000. The original estimator used more complex logic to produce a non-guess (accurate!) estimate of one row.
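To check that the estimator is what changes the plan, one option is to run the forced-index query under the legacy estimator on the 2016 instance. This is a minimal sketch against the dbo.T test table above. QUERYTRACEON needs sysadmin rights, and the USE HINT alternative assumes SQL Server 2016 SP1 or later.

```sql
-- Legacy (pre-2014) cardinality estimator via trace flag 9481,
-- applied to this statement only
select C1
from dbo.T with (index = IX_T_C2)
where C2 = 100000
option (maxdop 1, querytraceon 9481);

-- Same idea without a trace flag (assumes SQL Server 2016 SP1 or later)
select C1
from dbo.T with (index = IX_T_C2)
where C2 = 100000
option (maxdop 1, use hint ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));
```

If the estimator is the cause, the Estimated Number of Rows on the Index Scan should drop back to the figure seen in 2012.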

Now to the issue with the Index Scan. That is probably not what you want. You want SQL Server to do an Index Seek finding the row you are looking for. Currently SQL Server can't do that because you have a type mismatch in the where clause and you do get warnings about it in the query plan. Fix that and you should see a plan with an Index Seek and a RID Lookup in both versions of SQL Server.
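A sketch of the fix, first on the test table and then on the query from the question. It assumes the only change needed is to make the literal's type match the column's type (varchar for C2, nvarchar for wv_partyid).

```sql
-- The literal is now a string, so C2 (varchar(11)) is compared as-is
-- and no CONVERT_IMPLICIT is applied to the column
select C1
from dbo.T
where C2 = '100000'
option (maxdop 1);

-- Same idea for the query in the question: wv_partyid is nvarchar(100),
-- so compare it with an nvarchar literal
select address1_stateorprovince
from dla.dcrm.CONTACT_RC
where wv_partyid = N'343083';
```

With matching types no index hint should be necessary, since the optimizer can seek on the index directly.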

Note also that cost percentages in execution plans are always based on optimizer estimates, not real run time information.

Related Solutions

Sql-server – Why is the index not being used in a SELECT TOP

If I let the server decide which index to use, it picks IX_MachineryId, and it takes up to a minute.

That index is not partitioned, so the optimizer recognizes it can be used to provide the ordering specified in the query without sorting. As a non-unique nonclustered index, it also has the keys of the clustered index as subkeys, so the index can be used to seek on MachineryId and the DateRecorded range:

The index does not include OperationalSeconds, so the plan has to look that value up per row in the (partitioned) clustered index in order to test OperationalSeconds > 0:

The optimizer estimates that one row will need to be read from the nonclustered index and looked up to satisfy the TOP (1). This calculation is based on the row goal (find one row quickly), and assumes a uniform distribution of values.

From the actual plan, we can see the estimate of 1 row is inaccurate. In fact, 19,039 rows have to be processed to discover that no rows satisfy the query conditions. This is the worst case for a row goal optimization (1 row estimated, all rows actually needed):

You can disable row goals with trace flag 4138. This would most likely result in SQL Server choosing a different plan, possibly the one you forced. In any case, the index IX_MachineryId could be made more optimal by including OperationalSeconds.
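A rough sketch of both suggestions. The table name dbo.MachineryLog, the parameter values and types, and the query shape are assumptions reconstructed from the description above, not the original code. The index definition also assumes MachineryId is the only key column.

```sql
DECLARE
    @MachineryId integer = 1,        -- hypothetical value and type
    @From datetime = '20160101',     -- hypothetical range
    @To datetime = '20160201';

-- Trace flag 4138 disables row goals for this statement (requires sysadmin)
SELECT TOP (1) M.DateRecorded
FROM dbo.MachineryLog AS M           -- hypothetical table name
WHERE M.MachineryId = @MachineryId
AND M.DateRecorded >= @From
AND M.DateRecorded < @To
AND M.OperationalSeconds > 0
ORDER BY M.DateRecorded
OPTION (QUERYTRACEON 4138);

-- Rebuild the existing index in place with OperationalSeconds included,
-- so the test on OperationalSeconds no longer needs a lookup
CREATE INDEX IX_MachineryId
ON dbo.MachineryLog (MachineryId)
INCLUDE (OperationalSeconds)
WITH (DROP_EXISTING = ON);
```

After rebuilding, confirm where the index ended up. Add an explicit ON clause if it has to stay unpartitioned.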

It is quite unusual to have non-aligned nonclustered indexes (indexes partitioned in a different way from the base table, including not at all).

That really suggests to me that I have made the index right, and the server is just making a bad decision. Why?

As usual, the optimizer is selecting the cheapest plan it considers.

The estimated cost of the IX_MachineryId plan is 0.01 cost units, based on the (incorrect) row goal assumption that one row will be tested and returned.

The estimated cost of the IX_MachineryId_DateRecorded plan is much higher, at 0.27 units, mostly because it expects to read 5,515 rows from the index, sort them, and return the one that sorts lowest (by DateRecorded):

This index is partitioned, and cannot return rows in DateRecorded order directly (see later). It can seek on MachineryId and the DateRecorded range within each partition, but a Sort is required:

If this index were not partitioned, a sort would not be required, and it would be very similar to the other (unpartitioned) index with the extra included column. An unpartitioned filtered index would be slightly more efficient still.

You should update the source query so that the data types of the @From and @To parameters match the DateRecorded column (datetime). At the moment, SQL Server is computing a dynamic range due to the type mismatch at runtime (using the Merge Interval operator and its subtree):

<ScalarOperator ScalarString="GetRangeWithMismatchedTypes([@From],NULL,(22))">
<ScalarOperator ScalarString="GetRangeWithMismatchedTypes([@To],NULL,(22))">

This conversion prevents the optimizer from reasoning correctly about the relationship between ascending partition IDs (covering a range of DateRecorded values in ascending order) and the inequality predicates on DateRecorded.

The partition ID is an implicit leading key for a partitioned index. Normally, the optimizer can see that ordering by partition ID (where ascending IDs map to ascending, disjoint values of DateRecorded) then DateRecorded is the same as ordering by DateRecorded alone (given that MachineryID is constant). This chain of reasoning is broken by the type conversion.

Demo

A simple partitioned table and index:

CREATE PARTITION FUNCTION PF (datetime)
AS RANGE LEFT FOR VALUES ('20160101', '20160201', '20160301');

CREATE PARTITION SCHEME PS AS PARTITION PF ALL TO ([PRIMARY]);

CREATE TABLE dbo.T (c1 integer NOT NULL, c2 datetime NOT NULL) ON PS (c2);

CREATE INDEX i ON dbo.T (c1, c2) ON PS (c2);

INSERT dbo.T (c1, c2) 
VALUES (1, '20160101'), (1, '20160201'), (1, '20160301');

Query with matched types

-- Types match (datetime)
DECLARE 
    @From datetime = '20010101',
    @To datetime = '20090101';

-- Seek with no sort
SELECT T2.c2 
FROM dbo.T AS T2 
WHERE T2.c1 = 1 
AND T2.c2 >= @From
AND T2.c2 < @To
ORDER BY 
    T2.c2;

Query with mismatched types

-- Mismatched types (datetime2 vs datetime)
DECLARE 
    @From datetime2 = '20010101',
    @To datetime2 = '20090101';

-- Merge Interval and Sort
SELECT T2.c2 
FROM dbo.T AS T2 
WHERE T2.c1 = 1 
AND T2.c2 >= @From
AND T2.c2 < @To
ORDER BY 
    T2.c2;
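If the caller has to keep passing datetime2 values, one possible workaround is to convert them once into datetime variables and use those in the predicates. This is a sketch against the demo table above. The @FromDT and @ToDT names are made up for illustration.

```sql
-- Incoming values arrive as datetime2
DECLARE 
    @From datetime2 = '20010101',
    @To datetime2 = '20090101';

-- Convert once so the comparison is datetime to datetime
DECLARE 
    @FromDT datetime = CONVERT(datetime, @From),
    @ToDT datetime = CONVERT(datetime, @To);

SELECT T2.c2 
FROM dbo.T AS T2 
WHERE T2.c1 = 1 
AND T2.c2 >= @FromDT
AND T2.c2 < @ToDT
ORDER BY 
    T2.c2;
```

The conversion rounds to datetime precision, so check boundary values if the datetime2 inputs carry fractional seconds.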

Sql-server – Index issue on upgrade from SQL Server 2000 to 2012

The unstated assumption in the question is that the subquery is executed first, then the outer DELETE is processed. This is not how things work. People write queries that express a logical requirement, then the SQL Server query optimizer tries to find an efficient physical implementation.

The optimizer's decisions are driven by cost estimates for the various possible physical options it explores.

Garbage In, Garbage Out

By using a table variable, the current arrangement deprives the optimizer of two important pieces of information: the number of rows in the table (cardinality); and the distribution of those values (statistics).

In most cases, the optimizer is unable to see the cardinality of a table variable, and guesses at one row. The physical execution strategy it chooses on the basis that there is one row in the table variable is very likely suboptimal for half a million rows.

Assuming one row, the optimizer may well decide that scanning the PersonStudy table looking for matches is a good enough strategy:

In practice, this plan results in the PersonStudy table being scanned 500,000 times at runtime (once per row in the table variable). That is potentially 60,000 * 500,000 = 30 billion rows. No wonder it takes a while.

Given incorrect or incomplete information about the data, the chances are pretty high that the optimizer will deliver a poor execution plan.

Advice

Use a temporary table (e.g. #ExcludedPersons) instead of a table variable. This will provide accurate cardinality information, and allow SQL Server to automatically create statistics.
Constrain the personID columns to be NOT NULL. This gives the optimizer useful information and will allow it to avoid a common problem with NOT IN.
Make the personID column in the temporary table (or table variable) the PRIMARY KEY. Again, this provides useful information to the optimizer (uniqueness, ordering, not null).
Provide a useful index on the PersonStudy table. The suggested index is a reasonable choice, but there may be better options. A good index provides the optimizer with a more efficient data access path.
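A minimal sketch of the first three points. The dbo schema, the integer type, and the NOT IN shape of the DELETE are assumptions, since the original statement is not shown here.

```sql
-- Temporary table: real cardinality, automatic statistics,
-- and a NOT NULL primary key on personID
CREATE TABLE #ExcludedPersons
(
    personID integer NOT NULL PRIMARY KEY
);

-- Load #ExcludedPersons here, from whatever currently fills the table variable.
-- With NOT IN, an empty table would make the DELETE below remove every row.

DELETE PS
FROM dbo.PersonStudy AS PS           -- schema name is an assumption
WHERE PS.personID NOT IN
(
    SELECT EP.personID
    FROM #ExcludedPersons AS EP
);
```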

Especially if you are unable to switch to using a temporary table, test the following (but still add the constraints and indexes/keys mentioned above):

Add an OPTION (RECOMPILE) hint to the query. This will allow the optimizer to see the cardinality of the table variable (but not statistical distribution) at runtime.
Or: Use an OPTION (HASH JOIN) hint. Hash join scales better than nested loops with a table scan. The hash join may well spill and reverse roles at runtime, but this should still be very significantly better than what you have right now.
Or: If your workload often uses table variables with a significant number of rows, test the impact of enabling trace flag 2453. This will expose cardinality as above, without the (typically small) overhead of a plan recompilation.
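The three options can be sketched as follows, keeping the table variable. As before, the dbo schema, the integer type, and the NOT IN shape of the DELETE are assumptions rather than the original statement.

```sql
DECLARE @ExcludedPersons table
(
    personID integer NOT NULL PRIMARY KEY
);

-- Load @ExcludedPersons here before running the DELETE

-- Option 1: recompile so the optimizer sees the table variable's row count
DELETE PS
FROM dbo.PersonStudy AS PS           -- schema name is an assumption
WHERE PS.personID NOT IN
(
    SELECT EP.personID
    FROM @ExcludedPersons AS EP
)
OPTION (RECOMPILE);

-- Option 2: use OPTION (HASH JOIN) in place of OPTION (RECOMPILE) above

-- Option 3: enable trace flag 2453 instance-wide (requires sysadmin),
-- then run the DELETE without a hint
-- DBCC TRACEON (2453, -1);
```

Test one option at a time so the effect of each on the plan is clear.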