Sql-server – SQL Split Row Data Separated by Spaces

splitsql serversql-server-2016substring

I am looking for a query
to find nth value in a list. The separator is anything greater than or equal to 2 spaces. (it can be 3, or 5 spaces).
Trying to avoid scalar value functions, since performance may be slower. The sentences can have any number of words, from 5-20.

CREATE TABLE dbo.TestWrite (TestWriteId int primary key identity(1,1), 
                            TextRow varchar(255))
INSERT INTO dbo.TestWrite (TextRow)
SELECT 'I am writing SQL Code.'
UNION ALL
SELECT 'SQL keywords include join, except, where.'


+-----+----------+---------+---------------+---------+----------+
| SQL | keywords | include |     join,     | except, |   where. |
+-----+----------+---------+---------------+---------+----------+
| I   | am       | writing |    SQL  Code. |         |          |
+-----+----------+---------+---------------+---------+----------+

Would like in individual rows with columns, see comments above.

This may be one solution trying to utilize.
https://stackoverflow.com/questions/19449492/using-t-sql-return-nth-delimited-element-from-a-string

DECLARE @dlmt NVARCHAR(10)=N' ';
DECLARE @pos INT = 2;
SELECT CAST(N'<x>' + REPLACE(@input,@dlmt,N'</x><x>') + N'</x>' AS XML).value('/x[sql:variable("@pos")][1]','nvarchar(max)')

Best Answer

Splitting to separate fields of a result set is a bit tricky if you have a varying number of elements per row. There is a SQLCLR stored procedure in the SQL# library (that I wrote) that, while not in the Free version, does break a delimited string of varying elements into result set fields based on a Regular Expression delimiter. The delimiter being RegEx makes it quite easy to treat "one or more spaces" as the delimiter: \s+. For example:

EXEC SQL#.String_SplitResultIntoFields N'
SELECT [TextRow] FROM #TestWrite;',
N'\s+', NULL, NULL;
/*
Field1    Field2     Field3     Field4    Field5
I         am         writing    SQL       Code.
SQL       keywords   include    join,     except,
*/

As you can see, it determines the number of fields for the result set based on the first row, which is why there is no "Field6" to contain the final word in the second row.

Of course, you could always seed the first row with dashes separated by spaces to force a certain number of fields, but there is no way to then filter out that initial row:

EXEC SQL#.String_SplitResultIntoFields N'
SELECT N''- - - - - - - -''
UNION ALL
SELECT [TextRow] FROM #TestWrite;',
N'\s+', NULL, NULL;
/*
Field1    Field2     Field3     Field4    Field5    Field6   Field7    Field8
-         -          -          -         -         -        -         -
I         am         writing    SQL       Code.
SQL       keywords   include    join,     except,   where.
*/

I suppose I can add an optional input parameter for @ForceResultSetFieldCount easily enough, but it's not there as of today.

If this request was only what is initially stated in the question (i.e. "I am looking for a query to find nth value in a list"), then that is not only trivial (even with the "one or more spaces" delimiter requirement), but that requires a regular expression function, RegEx_CaptureGroupCapture, and that is in the Free version of SQL#. For example:

SETUP

CREATE TABLE #TestWrite
(
  TestWriteId INT PRIMARY KEY IDENTITY(1, 1),
  TextRow VARCHAR(255)
);

INSERT INTO #TestWrite (TextRow)
  SELECT 'I am     writing SQL                   Code.'
  UNION ALL
  SELECT 'SQL  keywords include   join,    except, where.';

TESTS

As you can see below, you can either use a pattern of "one or more word characters", which will exclude both whitespace and punctuation (first example), or you can use a pattern of "one or more non-whitespace", which will include punctuation, etc (second example).

-- only get "word" characters:
SELECT SQL#.RegEx_CaptureGroupCapture(t.[TextRow], N'(\w+)', 1, 4, 1, NULL, 1, -1, NULL)
FROM   #TestWrite t;
/*
SQL
join
*/

-- get non-whitespace:
SELECT SQL#.RegEx_CaptureGroupCapture(t.[TextRow], N'([^\s]+)', 1, 4, 1, NULL, 1, -1, NULL)
FROM   #TestWrite t;
/*
SQL
join,
*/