Sql-server – How SQL Server knows when to implicitly convert the values

sql server, sql server 2014, type conversion

Following my previous question: Implicit conversion does not affect performance

I ran the simple query below:

select count(*)
from fpc
where SKey in ('201701', '201702A')

SKey is of type int, and I have a nonclustered columnstore index on it.

Obviously I cannot run this, because the second value is not a number. So I hit Ctrl+L to see the estimated execution plan, and I noticed something interesting in the Predicate property:

[mydb].[dbo].[fpc].[SKey]=CONVERT_IMPLICIT(int,'201702A',0) OR
[mydb].[dbo].[fpc].[SKey]=(201701)

My question is: why does SQL Server treat '201701' as a number but use an implicit conversion on the second value, '201702A'?

What I'm interested in is the internal mechanism by which SQL Server looks at these two values. Does it know the first one is a number and the second one is not?

Best Answer

SQL Server attempts to convert both strings to the correct type (according to the data type precedence rules) for comparison with the integer SKey column during the constant folding phase of query compilation. This activity occurs very early in the process, well before even the simplest of query plans is considered.

When constant folding is successful, the input tree contains the derived literal value (as the correct type) and optimization continues, just as if the query writer had used a constant rather than an expression.

When constant folding is unsuccessful (for example, because the conversion would throw an error), the tree contains a conversion function instead. It would not be correct to throw an error at compilation time; an error should only occur when the query is executed and the problematic expression is actually evaluated (if at all).

So, in your case, '201701' is constant-folded to integer 201701, but '201702A' becomes CONVERT_IMPLICIT(int,'201702A',0).
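One way to observe the difference between compile time and execution time is to request the estimated plan first and then run the same statement inside TRY...CATCH. The following is a minimal sketch, assuming a hypothetical temp table #fpc standing in for the fpc table in the question; the plan details on a rowstore temp table may differ from the plan shown in the question.

```sql
-- Hypothetical stand-in for the fpc table from the question
CREATE TABLE #fpc (SKey int NOT NULL);
INSERT #fpc (SKey) VALUES (201701), (201702);
GO
-- Compile only: return the estimated plan instead of executing.
-- No conversion error is expected here.
SET SHOWPLAN_XML ON;
GO
SELECT COUNT(*) FROM #fpc WHERE SKey IN ('201701', '201702A');
GO
SET SHOWPLAN_XML OFF;
GO
-- Execute: the conversion of '201702A' is only attempted now,
-- so this is where the error is expected to surface.
BEGIN TRY
    SELECT COUNT(*) FROM #fpc WHERE SKey IN ('201701', '201702A');
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH;
GO
DROP TABLE #fpc;
```

SET SHOWPLAN_XML must be the only statement in its batch, which is why the GO separators are needed.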

Constant folding is much more powerful and complete than the above simple example would suggest. For example:

LastName LIKE SUBSTRING(LEFT(NCHAR(UNICODE(NCHAR(68))), 1) + N'%', 1, 2)

is constant-folded to:

LastName LIKE N'D%'
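As a quick sanity check on the value involved, the expression can be evaluated on its own. This shows only what the expression evaluates to, not the folding itself; the column alias FoldedPattern is just an illustrative name.

```sql
-- NCHAR(68) is N'D'; UNICODE(N'D') is 68 again; LEFT(..., 1) keeps N'D';
-- appending N'%' and taking the first two characters gives N'D%'.
SELECT SUBSTRING(LEFT(NCHAR(UNICODE(NCHAR(68))), 1) + N'%', 1, 2) AS FoldedPattern;
```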

In SQL Server 2012 and later, even deterministic SQLCLR scalar functions can be constant-folded.

Related Solutions

Sql-server – What exactly can SQL Server 2014 execute in batch mode

What exactly can run in batch mode as of SQL Server 2014?

SQL Server 2014 adds the following to the original list of batch mode operators:

Hash Outer join (including full join)
Hash Semi Join
Hash Anti Semi Join
Union All (Concatenation only)
Scalar hash aggregate (no group by)
Batch Hash Table Build removed

It seems that data can transition into batch mode even if it does not originate from a columnstore index.

SQL Server 2012 was very limited in its use of batch operators. Batch mode plans had a fixed shape, relied on heuristics, and could not restart batch mode once a transition to row-mode processing had been made.

SQL Server 2014 adds the execution mode (batch or row) to the query optimizer's general property framework, meaning it can consider transitioning into and out of batch mode at any point in the plan. Transitions are implemented by invisible execution mode adapters in the plan. These adapters have a cost associated with them to limit the number of transitions introduced during optimization. This new flexible model is known as Mixed Mode Execution.

The execution mode adapters can be seen in the optimizer's output (though sadly not in user-visible execution plans) with undocumented TF 8607. For example, the following was captured for a query counting rows in a row store:

[Image: optimizer output showing Row to Batch to Row adapters]

Is using a columnstore index a formal requirement that is necessary to make SQL Server consider batch mode?

It is today, yes. One possible reason for this restriction is that it naturally constrains batch mode processing to Enterprise Edition.

Could we maybe add a zero row dummy table with a columnstore index to induce batch mode?

Yes, this works. I have also seen people cross-joining with a single-row clustered columnstore index for just this reason. The suggestion you made in the comments to left join to a dummy columnstore table on false is terrific.

-- Demo the technique (no performance advantage in this case)
--
-- Row mode everywhere
SELECT COUNT_BIG(*) FROM dbo.FactOnlineSales AS FOS;
GO
-- Dummy columnstore table
CREATE TABLE dbo.Dummy (c1 int NULL);
CREATE CLUSTERED COLUMNSTORE INDEX c ON dbo.Dummy;
GO
-- Batch mode for the partial aggregate
SELECT COUNT_BIG(*) 
FROM dbo.FactOnlineSales AS FOS
LEFT OUTER JOIN dbo.Dummy AS D ON 0 = 1;

[Image: plan with dummy left outer join]
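To look for the execution mode adapters mentioned earlier, the dummy-join query can be compiled with the undocumented trace flag. This is a sketch only, assuming the dbo.FactOnlineSales and dbo.Dummy tables from the demo above and sysadmin permission for QUERYTRACEON. Trace flag 3604 is not mentioned in the answer; it is assumed here as the usual way to send trace output to the client's messages. Undocumented trace flags belong on test systems only, and their output format can change without notice.

```sql
-- Assumption: TF 3604 routes the TF 8607 optimizer output to the Messages tab
SELECT COUNT_BIG(*)
FROM dbo.FactOnlineSales AS FOS
LEFT OUTER JOIN dbo.Dummy AS D ON 0 = 1
OPTION (RECOMPILE, QUERYTRACEON 3604, QUERYTRACEON 8607);
```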

Documentation is thin

True.

The best official sources of information are Columnstore Indexes Described and SQL Server Columnstore Performance Tuning.

SQL Server MVP Niko Neugebauer has a terrific series of blog posts on columnstore in general.

There are some good technical details about the 2014 changes in the Microsoft Research paper, Enhancements to SQL Server Column Stores (pdf), though this is not official product documentation.

Sql-server – Variable Sniffing

This behavior is documented in the Query Hints topic:

When compiling query plans, the RECOMPILE query hint uses the current values of any local variables in the query and, if the query is inside a stored procedure, the current values passed to any parameters.
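A minimal sketch of the difference, assuming a hypothetical temp table #fpc and local variable @SKey (neither comes from the original answer):

```sql
-- Hypothetical table and data for illustration
CREATE TABLE #fpc (SKey int NOT NULL);
INSERT #fpc (SKey) VALUES (201701), (201701), (201702);

DECLARE @SKey int = 201701;

-- Without the hint: the optimizer does not use the current value of @SKey
SELECT COUNT(*) FROM #fpc WHERE SKey = @SKey;

-- With the hint: the statement is compiled using the current value of @SKey
SELECT COUNT(*) FROM #fpc WHERE SKey = @SKey OPTION (RECOMPILE);

DROP TABLE #fpc;
```

Both statements return the same count; the difference to look for is in the row estimates of the two plans.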
