Sql-server – Inline Scalar UDF Under RCSI – Can Results Differ

functionssql serversql-server-2019

SQL Server 2019 introduces Scalar UDF inlining, aka "Froid." This ".. embeds [scalar UDFs] in the calling SQL query."

Previously, scalar UDFs ran under their own execution context, separate from the surrounding query's. One consequence of this is that under read committed snapshot isolation (RCSI) the function could see a different set of values than the containing query sees (link).

Is it possible that a query containing a scalar function, when running in RCSI with concurrent writes, could produce different results depending on whether the function is in-lined or not?

Best Answer

Yes, an in-lined function can show different results than its out-lined (!?) counterpart. The following reliably reproduces the situation on my machine (Windows 10, 4 cores + HT @ 2GHz, 16GB RAM, SSD).

Configure the database and session to use Read Committed Snapshot Isolation (RCSI):

alter database Sandpit
set read_committed_snapshot on
with rollback immediate;
GO
set transaction isolation level read committed;
GO

This table will give a shared object on which concurrent workloads can act.

drop table if exists t;
go
create table t(c int);
insert t(c) values (1);
go

This table is to capture the results from the test, hopefully revealing divergent behaviour between functions that are in-lined and not:

drop table if exists #Out;
go
create table #Out(Base int, Old int, New int);
go

To demonstrate the different behaviour I want two function executed within a single SELECT, one of which in in-line and the other isn't. The documentation says

A scalar T-SQL UDF can be inline if .. the UDF does not invoke any intrinsic function .. such as GETDATE()

To ensure one UDF cannot be in-line I add a reference to GETDATE. Note that this additional statement plays no part in the logic of the UDF, it simply suppresses in-lining. (Indeed, this function could be optimised away. Perhaps some future release will implement just such a optimization?)

create or alter function dbo.Old_skool()
returns int
as
begin
    declare @tot int = 0;
    declare @d date = GETDATE();   -- inhibits in-lining
    select @tot = SUM(C) from t;
    return @tot;
end
go


create or alter function dbo.New_kid_on_the_block()
returns int
as
begin
    declare @tot int = 0;
    select @tot = SUM(C) from t;
    return @tot;
end
go

To reference the shared table I've arbitrarily chosen to use SUM. I believe, but haven't tested, that any other technique that surfaces differences in the rows seen by the functions and the containing SELECT (MIN, MAX, TOP(1)) would do just as well.

Next I start two sessions. The first is to perform the SELECT, the second to make concurrent writes against the shared table.

-- Session 1 for reads

set transaction isolation level read committed;
GO

truncate table #Out;

declare @c int = 0;

while @c < 99   -- large enough to exhibit the behaviour
begin
    insert #Out(Base, Old, New)
    select
        c,
        dbo.Old_skool(),
        dbo.New_kid_on_the_block()
    from t;

    set @c += 1;
end


-- Session 2 for writes

declare @c int = 0;

while @c < 99999
begin
    update t set c = c + 1;
    set @c += 1;
end

I set running the session performing writes. On my machine it runs for about 24s, which is ample time to switch to session 1 (the reads) and start it.

For one run over 99 SELECTs there are 12 instances where the in-line and traditional execution mechanism return different results. In every case the in-lined function returns the same result as the containing query (which is not to say this test shows such behaviour is guaranteed).

Base        Old         New
----------- ----------- -----------
1801        1802        1801
1803        1804        1803
1814        1815        1814
1841        1842        1841
1856        1857        1856
1857        1858        1857
1860        1861        1860
1861        1862        1861
1864        1865        1864
1883        1884        1883
1884        1885        1884
1890        1891        1890