SQL Server Pivot – How to Merge Multiple Rows into Fewest Rows of Distinct Values

pivotsql serversql-server-2012

In SQL Server does anyone know of a nice way to merge/flatten multiple rows of data into the fewest possible rows containing only the distinct non null values.

I.e.

A dataset like this:
before

As this:
after

If it helps, the before dataset is a pivoted line listing but without the aggregate. I can't aggregate it during the pivot as I want to keep each of the distinct values and not take the MAX or MIN.

The only way I can think of doing it involves splitting the data up and joining it all back together which wont be very efficient.

Best Answer

Your data appears to lack any relationship between the various column values. If you can define this relationship, you can PIVOT the data appropriately.

For example, if you simply want to align the data based on the order of the value (based on your default collation), you could use:

with rawdata as (
select * from (values
    ('00000000-0000-0000-0000-000000037850','Col2','Yes_02')
    ,('00000000-0000-0000-0000-000000037850','Col3','Full marketing schedule')
    ,('00000000-0000-0000-0000-000000037850','Col3','Negotiations started, fell through')
    ,('00000000-0000-0000-0000-000000037850','Col3','No budget')
    ,('00000000-0000-0000-0000-000000037850','Col3','Not interest')
    ,('00000000-0000-0000-0000-000000037850','Col3','Passed to Summerhouse')
    ,('00000000-0000-0000-0000-000000037850','Col4','Darren Waters_01')
    ,('00000000-0000-0000-0000-000000037850','Col4','David Edwards_01')
    ,('00000000-0000-0000-0000-000000037850','Col4','David Simons_01')
    ,('00000000-0000-0000-0000-000000037850','Col4','Jason Gould_01')
    ,('00000000-0000-0000-0000-000000037850','Col4','Martin Thorpe_01')
    ,('00000000-0000-0000-0000-000000037850','Col5','BETT New Exhibitor')
    ,('00000000-0000-0000-0000-000000037850','Col5','BETT Standard Exhibitor')
    ,('00000000-0000-0000-0000-000000037850','Col5','Exhibitor Verified')
    ) x ([ID],[Col],[Value])
    ), ordered as (
select
    ID
    ,Col
    ,[Value]
    ,rn = row_number() over (partition by ID, Col order by [Value])
    from rawdata
    )
select
    ID
    ,[Col1],[Col2],[Col3],[Col4],[Col5]
    from ordered o
    pivot(max([Value]) for Col in ([Col1],[Col2],[Col3],[Col4],[Col5])) pvt
    ;