The reason for the behavior is that SQL Server can't determine how many rows will match ForeignKey, since there is no index with RowKey as the leading column (it could deduce this from statistics on a #temp table, but those don't exist for table variables/UDTTs), so it estimates 100,000 rows, which is better handled with a scan than a seek + lookup. By the time SQL Server realizes there is only one row, it's too late.
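As a minimal sketch of that point (using the same table and column names as the query further down), swapping the table variable for a #temp table, which does get column statistics, may be enough to tip the optimizer back toward a seek:
CREATE TABLE #Keys (RowKey INT PRIMARY KEY);
INSERT #Keys (RowKey) VALUES (10);

SELECT
t.RowKey
,t.SecondColumn
FROM
dbo.Test t
INNER JOIN
#Keys k -- statistics on #Keys give the optimizer a realistic row count
ON
t.ForeignKey = k.RowKey;

DROP TABLE #Keys;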
You might be able to construct your UDTT differently; in more modern versions of SQL Server (2014 and later) you can create secondary indexes inline on table variables, but this syntax is not available in 2008 R2.
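For reference, a rough sketch of that inline syntax as it exists in SQL Server 2014 and later (the extra column and index name below are invented for illustration; this will not parse on 2008 R2):
DECLARE @Keys TABLE
(
RowKey INT PRIMARY KEY,
OtherValue INT,
INDEX IX_Keys_OtherValue NONCLUSTERED (OtherValue) -- secondary index, 2014+ only
);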
BTW you can get the seek behavior (at least in my limited trials) if you try to avoid the bitmap/probe by hinting a nested loops join:
DECLARE @Keys TABLE (RowKey INT PRIMARY KEY); -- can't hurt
INSERT @Keys (RowKey) VALUES (10);
SELECT
t.RowKey
,t.SecondColumn
FROM
dbo.Test t
INNER JOIN
@Keys k
ON
t.ForeignKey = k.RowKey
OPTION (LOOP JOIN);
I learned this trick from Paul White several years ago. Of course, you should be careful about putting any kind of join hint in production code - it can fail if people make changes to the underlying objects and that specific type of join is no longer possible, or simply no longer optimal.
For more complex queries, and when you move to SQL Server 2012 or above, it's possible that trace flag 2453 could help. That flag didn't help with this simple join, though. And the same disclaimers apply - this is another thing you generally shouldn't do without thorough documentation and rigorous regression testing in place.
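For completeness, here is a sketch of how you might test that flag at the session level (enabling it requires sysadmin, and again, it made no difference for this simple join):
DBCC TRACEON (2453); -- lets table variable row counts trigger recompiles, much like #temp tables

DECLARE @Keys TABLE (RowKey INT PRIMARY KEY);
INSERT @Keys (RowKey) VALUES (10);

SELECT
t.RowKey
,t.SecondColumn
FROM
dbo.Test t
INNER JOIN
@Keys k
ON
t.ForeignKey = k.RowKey;

DBCC TRACEOFF (2453);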
Also, Service Pack 1 is long out of support; you should get on Service Pack 3 + MS15-058.
If you want good results from the query optimizer, it pays to be careful about data types.
Your variables are typed as datetime2:
DECLARE @OrderStartDate datetime2 = '27 feb 2016';
DECLARE @OrderEndDate datetime2 = '28 feb 2016';
But the column these are compared to is typed smalldatetime (as the sdtm prefix suggests!):
[sdtmOrdCreated] SMALLDATETIME NOT NULL
The type mismatch makes it hard for the optimizer to work out the cardinality estimate across the type conversion, as shown in the execution plan XML:
<ScalarOperator ScalarString="GetRangeWithMismatchedTypes([@OrderStartDate],NULL,(22))">
<ScalarOperator ScalarString="GetRangeWithMismatchedTypes(NULL,[@OrderEndDate],(10))">
The current estimate may or may not be accurate (probably not). Fixing the type incompatibility may or may not completely solve your plan selection problem, but it is the first (easy!) thing I would fix before looking deeper into the issue:
DECLARE @OrderStartDate smalldatetime = CONVERT(smalldatetime, '20160227', 112);
DECLARE @OrderEndDate smalldatetime = CONVERT(smalldatetime, '20160228', 112);
Always check the accuracy of cardinality estimates, and the reason for any discrepancy before deciding to rewrite the query or use hints.
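One low-effort way to do that is to capture the actual execution plan and compare estimated with actual row counts on each operator. A sketch (the table name dbo.Orders is a placeholder; substitute your own):
DECLARE @OrderStartDate smalldatetime = CONVERT(smalldatetime, '20160227', 112);
DECLARE @OrderEndDate smalldatetime = CONVERT(smalldatetime, '20160228', 112);

SET STATISTICS XML ON; -- returns the actual plan, including estimated and actual row counts

SELECT COUNT_BIG(*)
FROM dbo.Orders AS o
WHERE o.sdtmOrdCreated >= @OrderStartDate
AND o.sdtmOrdCreated < @OrderEndDate;

SET STATISTICS XML OFF;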
See my SQLblog.com article, "Dynamic Seeks and Hidden Implicit Conversions" for more details on the dynamic seek.
Update: Fixing the data type got you the seek plan you wanted. The cardinality estimation errors caused by the type conversion were what gave you the slower plan before.
Best Answer
Benjamin Nevarez has an excellent article on how the cost-based optimizer in SQL Server works.
Essentially, when SQL Server is asked to read a table, it considers several methods of returning the requested data, assigns a cost to each type of operation, and chooses the cheapest path, or plan. That plan may involve seeking the clustered index, scanning the clustered index, or doing seeks or scans on the non-clustered indexes, if any exist.
Since the plan is cost-based, there is no single piece of code that will always perform a clustered-index scan, except perhaps for:
SELECT * FROM dbo.[table with a clustered index];
Even adding an ORDER BY clause to that statement may make SQL Server choose a different plan.
In your question you mention that your query performs a clustered index scan whenever the selected number of rows climbs above 10% of the total rows in the table. The number of rows requested versus the total number of rows in the table is part of the cost-based optimization, and is likely causing SQL Server to think it will be faster to simply scan the entire clustered index (the clustered index is the table), rather than using a series of index seeks followed by lookups.
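As a rough illustration of that tipping point (the table, column, and index here are invented; assume dbo.Example has a clustered primary key and a non-clustered index on SomeColumn), the same query shape can get a seek or a scan depending only on how many rows the predicate is expected to return:
SELECT *
FROM dbo.Example
WHERE SomeColumn BETWEEN 1 AND 10; -- narrow range: likely non-clustered index seek + key lookups

SELECT *
FROM dbo.Example
WHERE SomeColumn BETWEEN 1 AND 100000; -- wide range: likely clustered index scan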
Look at point #2 in this article by Aaron Bertrand. Rob Farley also has a good write-up about it.
Martin Smith's answer on this question also contains some excellent details on this topic.