Sql-server – At which point does SQL Server validate data for insertion

insertsql server

I have a query to insert data from one table into another. One of the source columns is larger than the destination table's, although the actual data selected in the query isn't. So e.g. the source table is varchar(25), the destination table is varchar(10), and the query selects records that do not exceed 10 characters for that specific column.

My initial query failed with a truncation error, but when I rewrote it selecting the same records but changing the where clause i.e. to use WHERE IN and filtering on a cte it ran successfully.

i.e instead of doing

select * from tbl where col1 = 1

I did:

with cte as (select 1) select * from tbl where col1 in (select 1 from cte)

I understand that this is a bug in SQL Server, still I am wondering at which point does SQL Server validate the data for insertion. At which point is the error thrown?

Best Answer

In all honesty, it depends on exactly how the query is processed, and it is certainly possible that changing the query (even though it yields the same end results) can make expressions (like attempted conversions required to fit a big string into a small column) happen before filters have eliminated those rows (and a different plan shape can occur simply because of the syntax, because stats have changed, because indexes have been added, etc. etc.).

Erland Sommarskog has always called those "illogical errors" - this UserVoice issue was migrated from a Connect item that was originally posted at least a decade ago.

In your specific case, there are multiple failsafes you can use here, assuming small_col is a varchar(10) and large_col is something that can't have more than 10 characters when col1 = 1:

INSERT dbo.target(small_col)
  SELECT CASE col1 
    WHEN 1 THEN CONVERT(varchar(10), large_col) END
  FROM dbo.source
  WHERE col1 = 1;

In fact an additional failsafe, depending on the data types involved, would be to use TRY_CONVERT() instead of CONVERT(), in SQL Server 2012 and up. I don't think that applies in this specific case, since virtually any string type can be converted to varchar(10), but might be useful in other cases.

There was a similar but not identical question in the past, where I added other examples and links to external references:

I also wrote about this in #1 here:

Anyway, to answer your explicit question, the error is only thrown when the query is processed in such a way that an expression is attempted on the violating rows before they are filtered out. This is why you tend to hear a lot of us old school folks lecturing about always using the right data type, not overloading columns, etc.