Sql-server – Find “second-greatest” string in each “group”

sql server

I'm doing some data analysis and want to find an easy way to examine all the members of each "group" in a group by function.

Like, 3 agents may be involved in an order. I want to quickly examine the three agents that were 'grouped' in this order for various reasons.

Usually, I would use group_concat for this (easy way to see all grouped strings). However replicating that using a 'group by' appears difficult and unwieldy for now in SQL sever.

Right now, rough-and-dirty, I would max(username) and min(username) to quickly find 2 (and 90% of orders probably have 2 or less people. Is there a way to do mid(username) or 2nd-highest(username), or percentile(50th, username)? That would be a great, quick way to find this relevant data. For some reason, the previous answers I've seen describing group_concat on SQL Server do not sound straightforward to me.

Sample data for instance:

employee  purchase_id
bill        1
bob         1
chrissy     1
mike        2
bill        2
bob         3

Currently I have this:

purchase_id, employee_count, complicated metric
1                 3              blahblah
2                 2                dsflsajf
3                 1                98%

I would like to see at a glance:

purchase_id, employees, complicated metric
1            (bill,bob,chrissy)   blahblah

However the group_concat seems very confusing to use with a group by statement – or simulating group_concat with SQL Server. So instead, how bout this.

select max(employee), min(employee)

purchase_id,  max(employee), min(employee)
1              bill            chrissy

in the example you see that bob is omitted, as max/ min will only find the two endpoints. If there was some kind of function to pull the second highest value, or 50th percentile value, on strings, that would be helpful.

Best Answer

Something like this would split the results into multiple columns, but you would need to know in advance the maximum number of employees per purchase_id:

With Ordered_cte As ( 
    Select employee,
        purchase_id,
        RowNo = Row_Number() Over (Partition By purchase_id Order By employee)
      From tbl_purchase_employee)
Select purchase_id,
    Employee1 = Max(iif(RowNo = 1, employee, Null)),
    Employee2 = Max(iif(RowNo = 2, employee, Null)),
    Employee3 = Max(iif(RowNo = 3, employee, Null))
  From Ordered_cte
  Group By purchase_id;