Sql-server – SQL Server 2017 – Extracting source and target column names within a string containing column mappings

parsesql-server-2017stringstring manipulation

I have the following format string stored in text (could be any number of columns):

col1_source|col1_target;col2_source|col2_target;col3_source|col3_target;...

I'm trying to come up with an elegant way of extracting and isolating all the xxx_source column names and all the xxx_target column names so I could store them in variables and get the following end result:

@Source_Columns = 'col1_source,col2_source,col3_source'
@Target_Columns = 'col1_target,col2_target,col3_target'

At the end of the day, I'd like to perform SELECTs on my source and target columns to perform data compares.

This is what I've achieved so far, but I find it's just too complex for nothing (with a table valued function):

CREATE FUNCTION [dbo].[UF_miscParseStringToTable]
    (
        @list   nvarchar(MAX)
        , @sep  varchar(8) 
    ) 
RETURNS @ts table
    (
        [ID]        int identity
        , [value]   nvarchar(MAX)
    )
AS 
BEGIN 

    -- Parameters check
    if ((@sep is null) or (datalength(@sep) < 1)) return 

    if ((@list is null) or (@list = '') or (@list = @sep)) return

    -- Add path wildcards directly with sep
    -- ?worth it?
    if (left(@sep, 1) <> '%')   set @sep = '%' + @sep
    if (right(@sep, 1) <> '%')  set @sep = @sep + '%'

    -- First first sep
    declare @i int
    set @i = patindex(@sep, @list)

    -- Acc values
    while (@i > 0) begin
        insert into @ts ([value]) values (rtrim(left(@list, @i - 1)))
        set @list = ltrim(right(RTRIM(@list), len(@list) + 3 - (@i + len(@sep) )))
        set @i = patindex(@sep, @list)
    end

    set @list = rtrim(@list)
    -- Insert last value, if any
    if (@list <> '') insert into @ts (value) values (@list)

    return

END

The function above basically takes my mapping string and converts it to a list of column names in a table (see query logic below):

DECLARE @Delim varchar(1) = '|'
DECLARE @Mapping varchar(max) = 'col1_source|col1_target;col2_source|col2_target;col3_source|col3_target'  
DECLARE @String varchar(max) = REPLACE(@Mapping,';', @Delim)
SELECT * FROM dbo.UF_miscParseStringToTable(@String, @Delim)

The above resulting query yields the following table:

 ID| value
 1 | col1_source
 2 | col1_target
 3 | col2_source
 4 | col2_target
 5 | col3_source
 6 | col3_target

I could perhaps do a join on the column indexes but, I'm finding it difficult to isolate my source and target fields so that I could perform data comparisons between them. In addition, I'd like to avoid performing an extra join to a table if I don't have to.

Here are the results desired (to be able to perform the following):

SELECT col1_source, col2_source, col3_source FROM mytable;

SELECT col1_target, col2_target, col3_target FROM mytable;

Any help or ideas would be great!

Shawn

Best Answer

You did not specify your RDBMS - my solution uses SQL Server.

Your question states that you have a formatted string stored in text

col1_source|col1_target;col2_source|col2_target;col3_source|col3_target;...

You further state

I'm trying to come up with an elegant way of extracting and isolating all the xxx_source column names and all the xxx_target column names so I could store them in variables and get the following end result:

@Source_Columns = 'col1_source,col2_source,col3_source' @Target_Columns = 'col1_target,col2_target,col3_target'


My solution uses SQL Server 2016 STRING_SPLIT. If you're not up to that version, there are other methods to split the string.

drop table if exists #temp
Declare @String varchar(100) = 'col1_source|col1_target;col2_source|col2_target;col3_source|col3_target;'
----------------------------
Declare @SourceColumns varchar(max)
Declare @TargetColumns varchar(max)

--Split on semicolon and use PARSENAME to extract the source and target column information
SELECT @SourceColumns = isnull(@SourceColumns + ',', '') + SourceColumn
    ,@TargetColumns = isnull(@TargetColumns + ',', '') + TargetColumn
FROM (
    SELECT parsename(replace(value, '|', '.'), 2) AS SourceColumn
        ,parsename(replace(value, '|', '.'), 1) AS TargetColumn
    FROM (
        SELECT *
        FROM String_split(@string, ';')
        ) a
    WHERE value <> ' '
    ) a

SELECT @SourceColumns

col1_source,col2_source,col3_source

SELECT @TargetColumns

col1_target,col2_target,col3_target