Sql-server – Split String Function

sql serversql-server-2012stringstring-splitting

I have a question regarding this Split String function Aaron Bertrand recommends here.

With it you can extract information between two slashes. Now if I want to extract not only one piece of info like the servername, but also a second piece like the day of the week. How would I have to rewrite the code in order to get that info in one result set?

Here's a sample of my data

E:\BaseData\RUK\HPP_Conversion_Detail\hpp_conversion_detail_report_2015_11_02.csv
E:\BaseData\RUK\Manual_Review\manual_review_report_2015_11_01.csv
E:\BaseData\RUK\Disputes\dispute_report_2015_11_01.csv
E:\BaseData\RSE\HPP_Conversion_Detail\hpp_conversion_detail_report_2015_11_02.csv
E:\BaseData\RSE\Manual_Review\manual_review_report_2015_11_01.csv
E:\BaseData\RSE\Disputes\dispute_report_2015_11_01.csv

I'm interested in everything between the second and third slash and the third and fourth slash. What Geoff suggested is a good idea. I'm then getting the results in rows. Is it possible to get the results back in additional columns?

Thanks!

Best Answer

Using what Aaron Bertrand recommends here.

CREATE FUNCTION dbo.SplitStringsOrdered
(
    @List       NVARCHAR(2000),
    @Delimiter  NVARCHAR(32)
)
RETURNS TABLE
AS
    RETURN 
    (
      SELECT rn = ROW_NUMBER() OVER (ORDER BY Number), Item 
        FROM (SELECT Number, Item = LTRIM(RTRIM(SUBSTRING(@List, Number, 
          CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number)))
        FROM (SELECT ROW_NUMBER() OVER (ORDER BY [object_id])
          FROM sys.all_objects) AS n(Number)
        WHERE Number <= CONVERT(INT, LEN(@List))
        AND SUBSTRING(@Delimiter + @List, Number, LEN(@Delimiter)) = @Delimiter
      ) AS y);

And your example data

CREATE TABLE #x
(filepath NVARCHAR(2000));

INSERT #x VALUES
('E:\BaseData\RUK\HPP_Conversion_Detail\hpp_conversion_detail_report_2015_11_02.csv'),
('E:\BaseData\RUK\Manual_Review\manual_review_report_2015_11_01.csv'),
('E:\BaseData\RUK\Disputes\dispute_report_2015_11_01.csv'),
('E:\BaseData\RSE\HPP_Conversion_Detail\hpp_conversion_detail_report_2015_11_02.csv'),
('E:\BaseData\RSE\Manual_Review\manual_review_report_2015_11_01.csv'),
('E:\BaseData\RSE\Disputes\dispute_report_2015_11_01.csv')

Then you can run something like

SELECT * FROM  #x x
CROSS   APPLY dbo.SplitStringsOrdered(x.filepath, '\') sso
WHERE sso.rn = 3

and get out the third part.

There are several ways to get at multiple parts. You could just drop the WHERE sso.rn = 3 from the end but the data becomes spread over lots of rows which might not work for you.

Something like this might work for you using sub-queries.

SELECT *
, Part3 = (SELECT sso.Item FROM dbo.SplitStringsOrdered(x.filepath, '\') sso
WHERE sso.rn = 3) 
, Part5 = (SELECT sso.Item FROM dbo.SplitStringsOrdered(x.filepath, '\') sso
WHERE sso.rn = 5) 
FROM  #x x

WARNING above the code highlighting mistakenly thinks that the \ escapes the single quote following it which is not in fact the case in SSMS/SQL-server unless you ask for that behavior.

If you want everything a hack using MIN Pivot might be the best for you.

;
WITH    splits
          AS (
               SELECT   x.filepath
                      , sso.rn
                      , sso.Item
               FROM     #x x
               CROSS APPLY dbo.SplitStringsOrdered(x.filepath, '\') sso
             )
             SELECT * 
             FROM
             (  SELECT  splits.filepath
          , splits.rn
          , splits.Item
    FROM    splits   )  AS src
    PIVOT ( MIN(Item)
    FOR rn IN ([1],[2],[3],[4],[5])
           )   AS pvt

But that ends up looking a bit crazy, isn't in fact as flexible or as useful as it might appear, and the aggregate MIN is ugly.

So you could go for a mix of the two without the pivot hack and with more sub query.

;WITH   splits
          AS (
               SELECT   x.filepath
                      , sso.rn
                      , sso.Item
               FROM     #x x
               CROSS APPLY dbo.SplitStringsOrdered(x.filepath, '\') sso
             )
    SELECT  x.filepath
          , (
              SELECT s.Item FROM splits s WHERE s.rn = 2 AND s.filepath = x.filepath
            )
          , (
              SELECT s.Item FROM splits s WHERE s.rn = 3 AND s.filepath = x.filepath
            )
          , (
              SELECT    s.Item
              FROM      splits s
              WHERE     s.rn = (
                                 SELECT MAX (splits.rn) FROM splits
                               )
                        AND s.filepath = x.filepath
            )
    FROM    #x x