Sql-server – How to select the set of last non-NULL values per column over a group

sql serverwindow functions

I am using SQL Server 2016 and the data I am consuming has the following form.

CREATE TABLE #tab (cat CHAR(1), t CHAR(2), val1 INT, val2 CHAR(1));

INSERT INTO #tab VALUES 
    ('A','Q1',2,NULL),('A','Q2',NULL,'P'),('A','Q3',1,NULL),('A','Q3',NULL,NULL),
    ('B','Q1',5,NULL),('B','Q2',NULL,'P'),('B','Q3',NULL,'C'),('B','Q3',10,NULL);

SELECT *
FROM    #tab;

enter image description here

I would like a obtain the last non-null values over columns val1 and val2 grouped by cat and ordered by t. The result I am seeking is

cat  val1 val2
A    1    P
B    10   C

The closest I have come is using LAST_VALUE while ignoring the ORDER BY which is not going to work since I need the ordered last non-null value.

SELECT DISTINCT 
        cat, 
        LAST_VALUE(val1) OVER(PARTITION BY cat ORDER BY (SELECT NULL) ) AS val1,
        LAST_VALUE(val2) OVER(PARTITION BY cat ORDER BY (SELECT NULL) ) AS val2
FROM    #tab
cat  val1 val2
A    NULL NULL
B    10   NULL

The actual table has more columns for cat (date and string columns) and more val columns (date, string, and number columns) to select the last non-null value.

Any ideas how to make this selection.

Best Answer

Using the concatenation technique from The Last non NULL Puzzle by Itzik Ben Gan would look like this with your sample table and column data types.

select T.cat,
       cast(substring(
                     max(cast(T.t as binary(2)) + cast(T.val1 as binary(4))),
                     3,
                     4
                     ) as int),
       cast(substring(
                     max(cast(T.t as binary(2)) + cast(T.val2 as binary(1))),
                     3,
                     1
                     ) as char(1))
from #tab as T
group by T.cat;

enter image description here

Another way to write this query that divides the steps into CTE's to perhaps better show what is going on. It gives the exact same execution plan as the query above.

with C1 as
(
  -- Concatenate the ordering column with the value column
  select T.cat,
        cast(T.t as binary(2)) + cast(T.val1 as binary(4)) as val1,
        cast(T.t as binary(2)) + cast(T.val2 as binary(1)) as val2
  from #tab as T
),
C2 as
(
  -- Get the max concatenated value per group
  select C1.cat,
         max(C1.val1) as val1,
         max(C1.val2) as val2
  from C1
  group by C1.cat
)
-- Extract the value from the concatenated column
select C2.cat,
       cast(substring(C2.val1, 3, 4) as int) as val1,
       cast(substring(C2.val2, 3, 1) as char(1)) as val2
from C2;

This solution uses the fact that concatenating a null value with something results in a null value. SET CONCAT_NULL_YIELDS_NULL (Transact-SQL)