SQL Server – Invalid Length Parameter Passed to LEFT or SUBSTRING Function

sql serversql-server-2016t-sql

I have a process that needs to split a string of key-value pairs. To do so, I'm using a combination of a temp table that has computed columns to calculate the key and the value from each key-value pair, and a string_split function in the INSERT statement to separate the key-value pairs into rows. This is how the table looks like:

CREATE TABLE #KeyValuePairs (
  [PK]       uniqueidentifier NOT NULL,
  [KeyValue] varchar(500) NOT NULL,
  [Key] AS CONVERT(varchar(35), SUBSTRING([KeyValue],1,CHARINDEX('=',[KeyValue])-1)) PERSISTED NOT NULL,
  [Value] AS SUBSTRING([KeyValue],CHARINDEX('=',[KeyValue]) + 1, LEN([KeyValue]) - CHARINDEX('=',[KeyValue])) PERSISTED NULL
); 

The clustered index just helps me to find a specific combination of primary key and value key efficiently. The following is how I insert data into it:

INSERT #KeyValuePairs ([PK], [KeyValue])
SELECT T.[PK], kv.[value]
FROM BaseTable T
CROSS APPLY string_split([KeyValueStringColumn],'*') kv
WHERE [KeyValueStringColumn] != ''
  AND kv.[value] != ''

Occasionally, this query fails with the following error:

Invalid length parameter passed to the LEFT or SUBSTRING function.

If I remove the computation from the table definition, leaving the fields NULL by the INSERT and instead run an UPDATE afterwards with the exact same formula, the UPDATE succeeds always.

Best Answer

This is a similar issue to the one discussed here, but I think it deserves its own question/solution as in my case the issue was not the use of substring in a query, but in a table that's being inserted to. Though, the reason apparently is the same ... as Aaron Bertrand stated in his answer to another question:

... because you can't always rely on SQL Server filtering rows before attempting calculations.

While that's understandable in a case where the filter and the computation happen in the same query, it is not so obvious that the same appears to happen when inserting data into a table that has computed columns. Apparently, the persisted columns are computed before being inserted into the table, therefore (I guess depending on which plan the optimizer chose) also computing values of rows that due to the filter condition are not being inserted into the table. Changing the table definition making sure the length parameter cannot become negative solved the problem.

CREATE TABLE #KeyValuePairs (
  [PK]       uniqueidentifier NOT NULL,
  [KeyValue] varchar(500) NOT NULL,
  [Key] AS CONVERT(varchar(35), SUBSTRING([KeyValue],1,NULLIF(CHARINDEX('=',[KeyValue]),0)-1)) PERSISTED NOT NULL,
  [Value] AS SUBSTRING([KeyValue],CHARINDEX('=',[KeyValue]) + 1, LEN([KeyValue]) - NULLIF(CHARINDEX('=',[KeyValue]),0)) PERSISTED NULL