SQL Server – Index Seek Scanning Whole Table Based on Parameter Value

execution-planperformancequery-performancesql servertype conversion

I have a query:

SELECT  Id, 
        ColumnA,
        ColumnB
FROM    MyTable
WHERE   ColumnA = @varA OR
        ColumnB = @varB  

The table is defined as

CREATE TABLE MyTable
(
    Id INT IDENTITY(-2147483648,1) PRIMARY KEY,
    ColumnA VARCHAR(22)
    ColumnB VARCAHR(22)
)

and there is a non clustered index on the table

CREATE INDEX IX_MyIndex ON MyTable
(
    ColumnA
)

When I run the query with the parameters below:

DECLARE @varA nvarchar(4000) = ''
DECLARE @varB  nvarchar(8) = '10140730'

The execution plan shows an index seek on IX_MyIndex, however it shows number of rows read as 17million rows but actual number of rows as 0 (There are 0 rows in MyTable.ColumnA with the value '')
If I turn SET STATISTICS IO ON I can see the full table is being read

This makes sense as per: this article in the section "Here’s a “bad” index seek"

However, when I run the same query with the parameters:

DECLARE @varA nvarchar(8) = 'a'
DECLARE @varB  nvarchar(8) = '10140730'

The seek operator doesn't have a "number of rows read" property (there are 0 rows MyTable.ColumnA in with value 'a') and SET STATISTICS IO reports single figure logical reads

Incidentally, the plan has an implicit convert warning and the issue goes away when I change the query like so:

SELECT  Id, 
        ColumnA,
        ColumnB
FROM    MyTable
WHERE   ColumnA = CONVERT(VARCHAR(22),@varA) OR
        ColumnB = CONVERT(VARCHAR(22),@varB)

Or change the underlying column to NVARCHAR

However, I am curious as to why the behaviour of the index seek with the two different values for @varA is different even though both of them return the same number of records in the table (0)

Best Answer

When there is a mismatch between the data types of the column and variable, SQL Server cannot directly use the seeking ability of a b-tree index to locate the correct range of values.

When the rules of data type precedence mean that the column data would have to be converted to the data type of the variable, this would mean scanning the whole table or index, converting each value and testing it against the variable as a residual predicate.

This is obviously not ideal, but so common (unfortunately) that SQL Server has a built-in way to achieve an index seek in these cases. It takes the supplied value and computes the range of values it maps to, accounting for the type conversion and collation.

This feature is known as a dynamic seek and the internal method that computes the mapped range is called GetRangeThroughConvert.

For example, when the nvarchar variable contains 'a', the mapped range of values for data type varchar might be 'a' to 'B' (the exact range depends on the collation). This means SQL Server can seek the varchar index between 'a' and 'B', testing only the matches for equality with 'a' (as nvarchar) as a residual predicate.

When the supplied value is the empty string, the computed range is infinite, so the whole index is effectively scanned.

For example:

DROP TABLE IF EXISTS dbo.MyTable;
GO
CREATE TABLE dbo.MyTable
(
    ColumnA varchar(22) COLLATE Latin1_General_CI_AS NOT NULL
);
GO
INSERT dbo.MyTable 
    WITH (TABLOCKX)
    (ColumnA)
SELECT TOP (1000)
    REPLICATE(CHAR(65 + ROW_NUMBER() OVER (ORDER BY @@SPID) % 26), 22)
FROM master.sys.all_columns AS AC1
CROSS JOIN master.sys.all_columns AS AC2;
GO
CREATE INDEX IX_MyIndex_A ON dbo.MyTable(ColumnA);

The following query uses a dynamic seek with a range of 'a' to 'B' and a residual predicate of CONVERT_IMPLICIT(nvarchar(22),[dbo].[MyTable].[ColumnA],0)=[@varA]:

DECLARE @varA nvarchar(22) = N'a';

SELECT MT.ColumnA 
FROM dbo.MyTable AS MT 
WHERE MT.ColumnA = @varA;

The execution plan shows the dynamic seek shape with 38 rows qualified by the seek, but all ultimately rejected by the residual:

dynamic seek plan

The 38 rows are those counted by the query:

SELECT COUNT_BIG(*)
FROM dbo.MyTable AS MT 
WHERE MT.ColumnA > 'a' 
AND MT.ColumnA < 'B';

When the variable contains an empty string, the calculated range is unbounded so the seek effectively scans the whole index:

DECLARE @varA nvarchar(22) = N''; -- empty string

SELECT MT.ColumnA 
FROM dbo.MyTable AS MT 
WHERE MT.ColumnA = @varA;

The execution plan shows all 1000 rows being read from the index (but again, discarded by the residual):

Seeking the whole table

The empty string is a special case where GetRangeThroughConvert cannot produce a useful range. A single space character does produce a narrow seek range (plan).

Anyway, the message is to pay careful attention to data types.

db<>fiddle