SQL Server – Retrieve Last Not-Null Value in Ordered Column

I have the following input:

 id | value 
----+-------
  1 |   136
  2 |  NULL
  3 |   650
  4 |  NULL
  5 |  NULL
  6 |  NULL
  7 |   954
  8 |  NULL
  9 |   104
 10 |  NULL

I expect the following result:

 id | value 
----+-------
  1 |   136
  2 |   136
  3 |   650
  4 |   650
  5 |   650
  6 |   650
  7 |   954
  8 |   954
  9 |   104
 10 |   104

The trivial solution would be join the tables with a < relation, and then selecting the MAX value in a GROUP BY:

WITH tmp AS (
  SELECT t2.id, MAX(t1.id) AS lastKnownId
  FROM t t1, t t2
  WHERE
    t1.value IS NOT NULL
    AND
    t2.id >= t1.id
  GROUP BY t2.id
)
SELECT
  tmp.id, t.value
FROM t, tmp
WHERE t.id = tmp.lastKnownId;

However, the trivial execution of this code would create internally the square of the count of the rows of the input table ( O(n^2) ). I expected t-sql to optimize it out – on a block/record level, the task to do is very easy and linear, essentially a for loop ( O(n) ).

However, on my experiments, the latest MS SQL 2016 can't optimize this query correctly, making this query impossible to execute for a large input table.

Furthermore, the query has to run quickly, making a similarly easy (but very different) cursor-based solution infeasible.

Using some memory-backed temporary table could be a good compromise, but I am not sure if it can be run significantly quicker, considered that my example query using subqueries didn't work.

I am also thinking on to dig out some windowing function from the t-sql docs, what could be tricked to do what I want. For example, cumulative sum is doing some very similar, but I couldn't trick it to give the latest non-null element, and not the sum of the elements before.

The ideal solution would be a quick query without procedural code or temporary tables. Alternatively, also a solution with temporary tables is okay, but iterating the table procedurally is not.

DROP TABLE IF EXISTS dbo.Example; CREATE TABLE dbo.Example ( id integer PRIMARY KEY, val integer NULL ); INSERT dbo.Example (id, val) VALUES (1, 136), (2, NULL), (3, 650), (4, NULL), (5, NULL), (6, NULL), (7, 954), (8, NULL), (9, 104), (10, NULL); SELECT E.id, E.val, lastval = CAST( SUBSTRING( MAX(CAST(E.id AS binary(4)) + CAST(E.val AS binary(4))) OVER ( ORDER BY E.id ROWS UNBOUNDED PRECEDING), 5, 4) AS integer) FROM dbo.Example AS E ORDER BY E.id;

SQL Server – Retrieve Last Not-Null Value in Ordered Column

Best Answer

Related Question

Best Answer

Related Solutions

Sql-server – TSQL change column constraint between null and not null

SQL Server – How to Get the First or Last Row in Ordered Result Set by Column Value

Related Question