SQL Server – How to Parse Column Names Using Ordinal Position from XML Data

sql serversql-server-2016t-sqlxml

I have several hundred tables in a database that have the same structure:

SomeId, Pos, Varying Number of Other Fields

So, for example, one table may look like this:

PersonId, Pos, Hobby, Degree
12345, 1, Basketball, Bachelor of Science
12345, 2, Baseball, Master of Science
12345, 3, Boxing, NULL
12345, 4, Tennis, NULL
22222, 1, Golf, Bachelor of Science
22222, 2, NULL, Master of Science
22222, 3, NULL, Doctorate

I want to roll up the values for every column 3-N. So this would become:

12345, "Basketball, Baseball, Boxing, Tennis",  "Bachelor of Science, Master of Science"
22222, "Golf", "Bachelor of Science, Master of Science, Doctorate"

Another table might look like this:

ClientId, Pos, Location, Language, CommunicationType
33333, 1, North Carolina, English, Phone
33333, 2, New York, Spanish, Email
33333, 3, NULL, Portuguese, NULL
44444, 1, California, English, Phone
44444, 2, NULL, NULL, Email

and become this:

33333, "North Carolina, New York", "English, Spanish, Portugeue", "Phone, Email"
44444, "California", "English", "Phone, Email"

What I would like to do is create a TVF where I can specify the table name and have the function return its fields. Ideally, rolled up like I just demonstrated above.

Solomon Rutzky provided a solution (SQL Server: Pass table name into table valued function as a parameter) where he showed how to use XML and CASE statements to accept table names in a TVF.

Here's an adaptation:

DECLARE @TableName sysname = 'objects'
/*
DECLARE @TableName sysname = 'columns'
DECLARE @TableName sysname = 'indexes'
*/
       
SELECT tab.BaseData.value(N'/row[1]/@name', N'VARCHAR(128)') AS [name],
       tab.BaseData.value(N'/row[1]/@object_id', N'BIGINT') AS [object_id],
       *
FROM (
    SELECT CASE @TableName
             WHEN N'objects' THEN (SELECT * FROM master.sys.tables FOR XML RAW, TYPE)
             WHEN N'indexes' THEN (SELECT * FROM master.sys.indexes FOR XML RAW, TYPE)
             WHEN N'columns' THEN (SELECT * FROM master.sys.columns FOR XML RAW, TYPE)
           END AS [BaseData]
     ) tab;

If I were to create a monster CASE statement and account for all possible incoming table names, is there a way to reference the columns by ordinal position (rather than name like I'm doing above)? Even better, roll up and delimit their values too (which is my ultimate goal)?

Thank you in advance!

Also, here's the DDL to create my two sample tables:

CREATE TABLE [dbo].[Person](
[PersonId] [int] NULL,
[Pos] [int] NULL,
[Hobby] [varchar](100) NULL,
[Degree] [varchar](50) NULL
)
GO
INSERT [dbo].[Person] ([PersonId], [Pos], [Hobby], [Degree]) VALUES (12345, 1, N'Basketball', N'Bachelor of Science')
GO
INSERT [dbo].[Person] ([PersonId], [Pos], [Hobby], [Degree]) VALUES (12345, 2, N'Baseball', N'Master of Science')
GO
INSERT [dbo].[Person] ([PersonId], [Pos], [Hobby], [Degree]) VALUES (12345, 3, N'Boxing', NULL)
GO
INSERT [dbo].[Person] ([PersonId], [Pos], [Hobby], [Degree]) VALUES (12345, 4, N'Tennis', NULL)
GO
INSERT [dbo].[Person] ([PersonId], [Pos], [Hobby], [Degree]) VALUES (22222, 1, N'Golf', N'Bachelor of Science')
GO
INSERT [dbo].[Person] ([PersonId], [Pos], [Hobby], [Degree]) VALUES (22222, 2, NULL, N'Master of Science')
GO
INSERT [dbo].[Person] ([PersonId], [Pos], [Hobby], [Degree]) VALUES (22222, 3, NULL, N'Doctorate')
GO

CREATE TABLE [dbo].[Client](
[ClientId] [int] NULL,
[Pos] [int] NULL,
[Location] [varchar](100) NULL,
[Language] [varchar](50) NULL,
[CommunicationType] [varchar](50) NULL
)
GO
INSERT [dbo].[Client] ([ClientId], [Pos], [Location], [Language], [CommunicationType]) VALUES (33333, 1, N'North Carolina', N'English', N'Phone')
GO
INSERT [dbo].[Client] ([ClientId], [Pos], [Location], [Language], [CommunicationType]) VALUES (33333, 2, N'New York', N'Spanish', N'Email')
GO
INSERT [dbo].[Client] ([ClientId], [Pos], [Location], [Language], [CommunicationType]) VALUES (33333, 3, NULL, N'Portuguese', NULL)
GO
INSERT [dbo].[Client] ([ClientId], [Pos], [Location], [Language], [CommunicationType]) VALUES (44444, 1, N'California', N'English', N'Phone')
GO
INSERT [dbo].[Client] ([ClientId], [Pos], [Location], [Language], [CommunicationType]) VALUES (44444, 2, NULL, NULL, N'Email')
GO

SELECT * FROM Person;
SELECT * FROM Client;

Best Answer

is there a way to reference the columns by ordinal position

Yes there is but I'm not sure how that would help you doing what you want. You put the ordinal position in the predicate, as you already do for row[1].

Changing '/row[1]/@name' to instead get the third column would look like '/row[1]/@*[3]'. You should be aware of that null values does not create any attributes so your data in the third attribute will not always come from the third column.

To fix that you could generate elements instead of attributes for column values and use XSINIL to get empty elements for null values in columns, ex: SELECT * FROM master.sys.indexes FOR XML RAW, ELEMENTS XSINIL, TYPE. Then you need to select the third element from the XML instead of the third attribute '/row[1]/*[3]'.

You are already on a path to "create a monster CASE statement and account for all possible incoming table names" so why not create a monster query that does what you want instead, without the XML stuff.

select T.PersonId as Id,
       '"' + string_agg(T.Hobby, ',') within group (order by T.Pos) + '", ' +
       '"' + string_agg(T.Degree, ',') within group (order by T.Pos) + '"' as Value
from dbo.Person as T
where @TableName = N'Person'
group by T.PersonId
union all
select T.ClientId,
       '"' + string_agg(T.Location, ',') within group (order by T.Pos) + '", ' +
       '"' + string_agg(T.Language, ',') within group (order by T.Pos) + '", ' +
       '"' + string_agg(T.CommunicationType, ',') within group (order by T.Pos) + '"'
from dbo.Client as T
where @TableName = N'Client'
group by T.ClientId;

You could use dynamic SQL against meta tables to generate the above query if you are in a situation where you need to update the function often or even automatically.

Since you are on SQL Server 2016 you don't have string_agg() you need to use for xml path to do the concatenation. The query got bigger but it is the same principle and can still be created using dynamic SQL.

select T.PersonId as Id,
       '"' + stuff((
                    select ', '+T2.Hobby 
                    from dbo.Person as T2 
                    where T.PersonId = T2.PersonId 
                    order by T2.Pos 
                    for xml path(''), type).value('text()[1]', 'nvarchar(max)'), 1, 2, '') + '", ' +
       '"' + stuff((
                    select ', '+T2.Degree 
                    from dbo.Person as T2
                    where T.PersonId = T2.PersonId 
                    order by T2.Pos 
                    for xml path(''), type).value('text()[1]', 'nvarchar(max)'), 1, 2, '') + '"' as Value
from dbo.Person as T
where @TableName = N'Person'
group by T.PersonId
union all
select T.ClientId,
       '"' + stuff((
                    select ', '+T2.Location 
                    from dbo.Client as T2 
                    where T.ClientId = T2.ClientId 
                    order by T2.Pos 
                    for xml path(''), type).value('text()[1]', 'nvarchar(max)'), 1, 2, '') + '", ' +
       '"' + stuff((
                    select ', '+T2.Language
                    from dbo.Client as T2 
                    where T.ClientId = T2.ClientId
                    order by T2.Pos 
                    for xml path(''), type).value('text()[1]', 'nvarchar(max)'), 1, 2, '') + '", ' +
       '"' + stuff((
                    select ', '+T2.CommunicationType
                    from dbo.Client as T2 
                    where T.ClientId = T2.ClientId
                    order by T2.Pos 
                    for xml path(''), type).value('text()[1]', 'nvarchar(max)'), 1, 2, '') + '"'
from dbo.Client as T
where @TableName = N'Client'
group by T.ClientId;