Sql-server – TSQL – create a function that takes a multi-row query result as an argument

functionssql serversql server 2014

I work with a variety of servers on mostly SQL Server 2014 and earlier.

I have boiler-plate code that can take the results of an arbitrary select statement (returning a list of varchar or the like) and transform them into a single comma separated string. I use this regularly on reports (for e.g. listing the agents associated with a single call).

I'm getting tired of retyping it.

How can I create a function (or stored procedure, or other thing I don't know about) to re-use this code neatly?

Ideal useage would be something like:

select @result = udf_makeCSF( (select [First Name] + [Last Name] from AgentTable agents where ... ) )

I've seen some suggestions about using exec for similar problems and that gives me the stone-cold horrors.

EDIT: To make it clear, I already know how to make a comma-separated list. That's the boiler-plate code which I'm getting tired of copy-pasting everywhere. So please don't get hung up on explaining that, that's a solved problem.

But to extend my example, to make it more concrete for folks:

If I have CallSegments table, like so:

pkey    callid    start time     duration    time to answer    handling agent   caller     ...
1       1         8:00:20        260         12                12               5551234    ...
2       1         8:04:35        380         3                 16               5551234    ...
3       2         9:08:16        512         36                12               5554321    ...
4       3         9:58:42        8           1                 12               5557890    ...

This table represents incoming activity at a call center. Each row is a single segment of a call – A single call may have multiple segments, such as when it is transferred to another agent or another queue, a conference call is started, etc.

This is simplifying the data I'm working with, it better represents the state of affairs halfway through my procedure.

I also have an Agents table, like so

pkey    first name     last name    hire date    ...
12      Alice          Smith        1900-01-01   ...
14      Bob            Jones        2000-12-31   ...

This represents individual call-center agents. The only information I need out of this (for this report) is names of handling agent(s)

I am tasked with producing a report on calls where I assemble the various segments into a single line per call summarizing the lifecycle of the call.

Example desired output would be something like this:

start time     duration    time to answer    handling agents           caller     ...
8:00:20        00:10:40    00:00:15          Alice Smith, Bob Jones    555-1234   ...
9:08:16        00:08:32    00:00:36          Alice Smith               555-4321   ...
9:58:42        00:00:08    00:00:01          Charley Carpenter         555-7890   ...

Collecting the statistics is a group by and a couple of SUM()s. Collecting the handling agents is not.

This is not my exact scenario every time, but it's variations on this depending on report requirements (agent names is common, but also what queues did this call go through, and similar). The similarity is having a column of data that needs to get concatenated with a separator (usually commas) and returned as a single string.

Which feels to me like making this a shared snipped of code to be re-used rather than retyped repeatedly is good. I would write a custom aggregation method in C# and be done with it, but most of the client sites I work at do not allow for this.

Best Answer

Really what you are looking to do is inline concatenation / aggregation. The two easiest methods are:

  1. SQLCLR (works with SQL Server 2005 and newer, but not Azure SQL Database unless it is the new Managed Instance type)
  2. The STRING_AGG aggregate function (works with SQL Server 2017 and newer, and works on Azure SQL Database)

Currently, neither of these options works for you:

  • Option #1 is prohibited by policy, and (or due to) the fact that the client(s) will soon be migrating to Azure SQL Database which does not support SQLCLR (anymore; it did for about 18 months).
  • Option #2 is not possible due to being on SQL Server 2014 which does not have that function

This might be a long-shot, but given that you are only prohibited from using SQLCLR by policy, and that they will be migrating to Azure SQL Datbase, you might could do the following:

  1. Short-term (prior to migrating to Azure SQL Database): use a SQLCLR aggregate function
  2. Long-term (upon migrating to Azure SQL Database): swap out the SQLCLR aggregate for STRING_AGG

While this does entail modifying queries upon migration, the modification is a simple replacement. Assuming that all stored procedures, functions, etc are stored in a repository (e.g. Git, SVN, etc), it should be fairly easy to do a mass replacement across multiple files. Then deploy, test, and commit to the repository.

For example (using the SQL# library that I wrote which has Agg_Join in the Free version):

SELECT so.[schema_id],
       COUNT(*) AS [NumObjects],
       SQL#.Agg_Join(DISTINCT RTRIM(so.[type])) AS [Types]
FROM   sys.objects so
GROUP BY so.[schema_id];


SELECT so.[schema_id],
       COUNT(*) AS [NumObjects],
       STRING_AGG(RTRIM(so.[type]), ',')
              WITHIN GROUP (ORDER BY RTRIM(so.[type])) AS [Types]
FROM   sys.objects so
GROUP BY so.[schema_id];

These don't return identical results sets unless you remove the DISTINCT from SQL#.Agg_Join(DISTINCT RTRIM(so.[type])). I kept that in there to show a case where it helps, and to point out that for some reason, DISTINCT is not supported in the STRING_AGG function :(.

If you can do a Regular Expression (RegEx) replace, you can swap out:

SQL#.Agg_Join({stuff})

for:

STRING_AGG({stuff}, ',') [WITHIN GROUP (ORDER BY {stuff})]

with minimal effort. The WITHIN GROUP part is optional (I added it there since the DISTINCT in the SQLCLR version did the ordering). The main point being that the queries are otherwise untouched and should produce the same results, hence not needing a massive retesting effort.

In fact, if you create your own SQLCLR UDA such that it has the second parameter for the delimiter (the SQL# Agg_Join doesn't such that it can work in SQL Server 2005), you could name it "String_Agg". This would reduce the find/replace to be replacing dbo.String_Agg( with String_Agg(.