Sql-server – why the split string functions are not working with large OBJECT_DEFINITION

sql serversql-server-2008sql-server-2016string-searchingstring-splitting

I have in one of my databases a stored procedure

called PROCEDURE [repl].[usp_upd_repl_out_application], that is not exactly a very clever stored procedure but it has over 389 lines.

As I need to change this stored procedure, I like to back it up to a table before any alteration, I am aware of source control applications and that should be the way to go, but I am happy doing things my way and it works for me.

it might be another case of re-inventing the wheel, but I still would like to know why my split strings functions are failing.

So while attempting to backup this stored procedure
this stored procedure is object_id = 146815585 in my_database.

this works perfectly:

DECLARE 
  @olddelim nvarchar(32) = char(13) + Char(10),
  @newdelim nchar(1)     = NCHAR(9999); -- pencil (✏)
SELECT * 
  FROM STRING_SPLIT(REPLACE(OBJECT_DEFINITION(146815585)
  , @olddelim
  , @newdelim)
  , @newdelim);

this also works nicely, although it is not the way I want it:

if object_id('tempdb..#radhe') is not null
   drop table #radhe

create table #radhe(i int identity(1,1) not null primary key clustered, ln nvarchar(max))
insert into #radhe (ln)
exec sp_helptext '[repl].[usp_upd_repl_out_application]'

I have got this split function from here:

ALTER FUNCTION dbo.fn_TVF_Split(@arr AS NVARCHAR(max), @sep AS NCHAR(1))
RETURNS TABLE
WITH SCHEMABINDING,ENCRYPTION
AS
RETURN
WITH 
   L0 AS (SELECT 1 AS C UNION ALL SELECT 1)       --2 rows
  ,L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B)    --4 rows (2x2)
  ,L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B)    --16 rows (4x4)
  ,L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B)    --256 rows (16x16)
  ,L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B)    --65536 rows (256x256)
  ,L5 AS (SELECT 1 AS C FROM L4 AS A, L4 AS B)    --4,294,967,296 rows (65536x65536)
  ,Nums AS (SELECT row_number() OVER (ORDER BY (SELECT 0)) AS N FROM L5)  
SELECT
(n - 1) - LEN(REPLACE(LEFT(@arr, n-1), @sep, N'')) + 1 AS pos,
SUBSTRING(@arr, n, CHARINDEX(@sep, @arr + @sep, n) - n) AS element
FROM Nums
WHERE 
    n <= LEN(@arr) + 1
    AND SUBSTRING(@sep + @arr, n, 1) = @sep
    AND N<=1000
GO



DECLARE
@olddelim nvarchar(32) = Char(10) -- char(13) + Char(10), 
,@newdelim nchar(1)     = NCHAR(9999); -- pencil (?) 
SELECT m.*  
--INTO TableBackups.dbo._MMiorelli_20180305_repluspupdreploutapplication_140331_DB_ID_32 
from  dbo.fn_TVF_Split( OBJECT_DEFINITION ( 146815585),  @olddelim) m  

it only returns 18 rows out of the 398

and this one below, only returns 60 out of the 398.

--=============================================
-- this is the SplitString function 
-- for use when we're below sql 2016
--=============================================
alter FUNCTION dbo.SplitString

    (
        @List NVARCHAR(MAX),
        @Delim NVARCHAR(255)
    )
    RETURNS TABLE
    WITH ENCRYPTION
    AS
        RETURN ( SELECT [Value] FROM 
          ( 
            SELECT 
              [Value] = LTRIM(RTRIM(SUBSTRING(@List, [Number],
              CHARINDEX(@Delim, @List + @Delim, [Number]) - [Number])))
            FROM (SELECT Number = ROW_NUMBER() OVER (ORDER BY name)
              FROM sys.all_objects) AS x
              WHERE Number <= LEN(@List)
              AND SUBSTRING(@Delim + @List, [Number], LEN(@Delim)) = @Delim
          ) AS y
        );

this is how I call this function:

 DECLARE  
@olddelim nvarchar(32) = Char(10) -- char(13) + Char(10), 
,@newdelim nchar(1)     = NCHAR(9999); -- pencil (?) 

SELECT m.*  
from  master.dbo.splitstring( OBJECT_DEFINITION ( 146815585),  @olddelim) m  

returns 60 rows.

question:

is there a way I could modify my function master.dbo.splitstring so that it would return the whole 398 lines of the procedure?

Best Answer

I don't think your sys.all_objects has enough rows to make this work. It looks like your split routine is reading each individual character and concatenating it. You have more characters in your stored procedure than you have rows in sys.all_objects.

I created and populated a dbo.numbers table with 100,000 rows and modified your code to use that instead of sys.all_objects. I also included the row_number I was on to be able to see how the 'numbers' were being consumed.

You can really see what's going on if you run with output to text.

--drop table numbers
SELECT TOP 100000
        IDENTITY(INT,1,1) AS Number
   INTO dbo.numbers
   FROM Master.dbo.SysColumns sc1,
        Master.dbo.SysColumns sc2

--=============================================
-- this is the SplitString function 
-- for use when we're below sql 2016
--=============================================
alter FUNCTION dbo.SplitString

    (
        @List NVARCHAR(MAX),
        @Delim NVARCHAR(255)
    )
    RETURNS TABLE
    WITH ENCRYPTION
    AS
        RETURN ( SELECT * FROM 
          ( 
            SELECT 
            convert(varchar(10),number) as num,
              [Value] = LTRIM(RTRIM(SUBSTRING(@List, [Number],
              CHARINDEX(@Delim, @List + @Delim, [Number]) - [Number])))
            FROM (SELECT Number = ROW_NUMBER() OVER (ORDER BY number)
              FROM dbo.numbers) AS x
              WHERE Number <= LEN(@List)
              AND SUBSTRING(@Delim + @List, [Number], LEN(@Delim)) = @Delim
          ) AS y
        );

DECLARE  
@olddelim nvarchar(32) = Char(10) -- char(13) + Char(10), 
,@newdelim nchar(1)     = NCHAR(9999); -- pencil (?) 

SELECT m.*  
from  dbo.splitstring( OBJECT_DEFINITION ( 727113781),  @olddelim) m