Sql-server – Efficiency of Scalar UDF vs TVF

functions, optimization, set-returning-functions, sql-server

I am trying to optimize the rollup code for my company and ran into a very peculiar issue. I converted many Scalar functions to be TVFs and they all seem to run more quickly than the original, which is great. However, in the queries that call them, they end up running significantly slower than the original. Here is a basic outline of my update:

SELECT col1, ..., colx,
    (CASE WHEN x <= 0 OR y <= 0 OR z <= 0 OR z = x 
          THEN output
        WHEN valX <= 0 
          THEN output
        WHEN minimum.minVal < 1.0 THEN 1.0
        ELSE minimum.minVal
        END) AS Q
FROM Tbl1...tblx (series of inner joins)
CROSS APPLY dbo.inlinemin(val1, val2) AS minimum

This is a basic outline of the original:

SELECT col1, ..., colx,
    (CASE WHEN x <= 0 OR y <= 0 OR z <= 0 OR z = x 
          THEN output
        WHEN valX <= 0 
          THEN output
        ELSE maximum(minimum(val1,val2),1.0)
        END) AS Q
FROM Tbl1...tblx (series of inner joins)

The numbers are more or less the same, as is the logic. The only difference is that my function 'inlineMin' is a TVF, whereas the original 'maximum' and 'minimum' are scalar functions. These functions are exceedingly simple and just return the larger or smaller of the two parameters passed in. Even the execution plan is more or less the same. There is a change from merge join to hash match at one point; however, the cost of this difference is minimal and could not account for the drastic change in elapsed time and CPU time.

When I run the functions outside of the rollup query my function is faster than the original for large sets of data. This makes sense given how TVFs work in comparison to scalar UDFs. However, when I call them in the query, my updated version runs roughly 6x slower. The cross apply is (seemingly) not the issue since leaving the cross apply and simply using the old functions

SELECT ...
    ELSE maximum(minimum(val1,val2),1.0)
    END) AS Q
FROM Tbl1...tblx (series of inner joins)
CROSS APPLY inlinemin(val1, val2) AS NotUsedHere

is roughly as efficient as the original code. It is only when I include the output of my function in the select that the query becomes significantly slower.

As I understand, the function is called and runs at the cross apply, meaning that it should be calculating a value even if it is not in the select, so why would it be faster not to include it in the select? Further, if the above is false, why would my function itself be faster but run significantly slower when used inside of a query?

Edit:

Here is the Inline TVF that I have written to replace the original

CREATE FUNCTION [dbo].[InlineMin](@val1 FLOAT, @val2 FLOAT)
RETURNS TABLE WITH SCHEMABINDING
AS
RETURN
    SELECT minVal =
        CASE
            WHEN @val1 < @val2 THEN @val1
            ELSE ISNULL(@val2, @val1)
        END
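
As a quick sanity check of the NULL handling, the function can be called against a few literal pairs. This is only a minimal sketch: it assumes dbo.InlineMin has been created exactly as above, and the VALUES rows and the aliases v and m are made up for illustration.

```sql
-- Assumes dbo.InlineMin from above already exists in the current database.
-- The literal rows are illustrative test values, not data from the real rollup.
SELECT v.val1, v.val2, m.minVal
FROM (VALUES
        (1.0, 2.0),   -- val1 < val2 is true      -> minVal = 1
        (2.0, 1.0),   -- falls to ELSE            -> ISNULL(1, 2) = 1
        (NULL, 3.0),  -- comparison is UNKNOWN    -> ISNULL(3, NULL) = 3
        (3.0, NULL)   -- comparison is UNKNOWN    -> ISNULL(NULL, 3) = 3
     ) AS v (val1, val2)
CROSS APPLY dbo.InlineMin(v.val1, v.val2) AS m;
```

When one argument is NULL the comparison is not true, so the ELSE branch returns whichever argument is not NULL.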

Here is the anonymized query plan for my rewrite:
https://www.brentozar.com/pastetheplan/?id=ryKR_Q0Em

and the anonymized query plan for the original:
https://www.brentozar.com/pastetheplan/?id=SJsS1-A4X

Best Answer

> As I understand, the function is called and runs at the cross apply, meaning that it should be calculating a value even if it is not in the select, so why would it be faster not to include it in the select?

The optimizer is very good at removing subtrees that compute expressions that are not needed in the final result (top-level projection). When you remove the value from the select list, the work needed to compute that value is simply not done.

> Further, if the above is false, why would my function itself be faster but run significantly slower when used inside of a query?

This is difficult to assess in detail from an anonymized plan. Nevertheless, removing the scalar T-SQL functions allows the optimizer to consider parallel plans. You might like to test your rewritten query with an OPTION (MAXDOP 1) query hint to see how the serial plan selected compares with your original.
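
A minimal sketch of that test follows. It assumes dbo.InlineMin from the question exists; the temporary table #rollup and its three rows are hypothetical stand-ins for the real joined tables, so the timings it produces say nothing about the actual rollup query. The point is only where the hint goes and how to capture CPU and elapsed time for comparison.

```sql
-- Hypothetical stand-in for the joined rollup tables in the question.
CREATE TABLE #rollup (val1 float NULL, val2 float NULL);

INSERT INTO #rollup (val1, val2)
VALUES (0.5, 2.0), (3.0, 1.5), (NULL, 4.0);

-- Report CPU time and elapsed time for each statement in the Messages output.
SET STATISTICS TIME ON;

SELECT r.val1,
       r.val2,
       CASE WHEN m.minVal < 1.0 THEN 1.0 ELSE m.minVal END AS Q
FROM #rollup AS r
CROSS APPLY dbo.InlineMin(r.val1, r.val2) AS m
OPTION (MAXDOP 1);   -- restrict this statement to a serial plan

SET STATISTICS TIME OFF;

DROP TABLE #rollup;
```

Running the real rewritten query once with the hint and once without it, and comparing the reported times against the original, should show whether the parallel plan is what makes the difference.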

Parallel plans are not always better (though they are only selected if they appear to the optimizer to have a lower cost). Your query has a relatively low estimated cost, so the optimizer does not consider it worthwhile to explore a large number of alternatives. There are cases where the time spent considering both serial and parallel plans is counter-productive to final plan quality.

I do apologise if this is a little vague, but anonymized plans really do make it tough to be specific. All things being equal, in-line functions will currently out-perform scalar functions. Sadly, all things are rarely equal.

Related Solutions

Sql-server – Why would call to scalar function inside a Table Value Function be slower than outside the TVF

Scalar functions are called once-per-row, when called as part of a query.

Consider the following example.

Create a new, blank database for our tests:

USE master;
IF EXISTS (SELECT 1 FROM sys.databases d WHERE d.name = 'mv')
BEGIN
    ALTER DATABASE mv SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE mv;
END
GO
CREATE DATABASE mv;
GO

Create a table, a multi-statement scalar function, and an inline table-valued function:

USE mv;
GO
CREATE TABLE dbo.t
(
    t_id int NOT NULL
        CONSTRAINT PK_t
        PRIMARY KEY CLUSTERED
);
GO

CREATE FUNCTION dbo.t_func
(
    @t_id int
)
RETURNS bit
WITH SCHEMABINDING
AS
BEGIN
    DECLARE @r bit;
    IF EXISTS (SELECT 1 FROM dbo.t WHERE t.t_id = @t_id)
        SET @r = 1
    ELSE
        SET @r = 0;
    RETURN @r;
END
GO

CREATE FUNCTION dbo.t_tvf
(
    @min_t_id int
    , @max_t_id int
)
RETURNS TABLE 
WITH SCHEMABINDING
AS
RETURN (
    SELECT t_id = t.t_id
        , e = dbo.t_func(dbo.t.t_id)
    FROM dbo.t
    WHERE t.t_id >= @min_t_id
        AND t.t_id <= @max_t_id
);
GO

Insert some sample data into the table:

INSERT INTO dbo.t (t_id)
SELECT ROW_NUMBER() OVER (ORDER BY c.id, c.colid)
FROM sys.syscolumns c;
GO

Create a table to store function execution stats, and populate it with a start-row showing execution counts for the multi-statement-function, t_func:

CREATE TABLE dbo.function_stats
(
    run_num int NOT NULL
    , object_name sysname NOT NULL
    , execution_count int NULL 
    , CONSTRAINT PK_function_stats
        PRIMARY KEY CLUSTERED (run_num, object_name)
);
GO
INSERT INTO dbo.function_stats (run_num, object_name, execution_count)
SELECT 1
    , o.name
    , COALESCE(fs.execution_count, 0)
FROM sys.objects o 
    LEFT JOIN sys.dm_exec_function_stats fs ON fs.object_id = o.object_id
WHERE o.name = 't_func';
GO

Run a query against the TVF:
```
SELECT t.*
FROM dbo.t_tvf(1, 2) t;
GO
```

Capture the execution stats now:

INSERT INTO dbo.function_stats (run_num, object_name, execution_count)
SELECT 2
    , o.name
    , COALESCE(fs.execution_count, 0)
FROM sys.objects o 
    LEFT JOIN sys.dm_exec_function_stats fs ON fs.object_id = o.object_id
WHERE o.name = 't_func';

The function stats results:

SELECT *
FROM dbo.function_stats fs
ORDER BY fs.run_num
    , fs.object_name;

╔═════════╦═════════════╦═════════════════╗
║ run_num ║ object_name ║ execution_count ║
╠═════════╬═════════════╬═════════════════╣
║       1 ║ t_func      ║               0 ║
║       2 ║ t_func      ║               2 ║
╚═════════╩═════════════╩═════════════════╝

As you can see, the multi-statement function has executed twice, once per row of the source table accessed by the TVF.

I expect the multi-statement function is being called many, many times by the TVF, giving the impression that it is running slowly, whereas in fact it is simply being called many times.
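
To see how the call count scales with the number of rows, the same test can be repeated with a wider range. The following is a sketch only: it assumes the objects created above still exist, that dbo.t holds at least ten rows, and that sys.dm_exec_function_stats is available (SQL Server 2016 or later). The run number 3 and the column alias executions_in_run are arbitrary choices for illustration.

```sql
-- Run the TVF over ten rows instead of two.
SELECT t.*
FROM dbo.t_tvf(1, 10) t;
GO

-- Capture the cumulative execution count again as run 3.
INSERT INTO dbo.function_stats (run_num, object_name, execution_count)
SELECT 3
    , o.name
    , COALESCE(fs.execution_count, 0)
FROM sys.objects o
    LEFT JOIN sys.dm_exec_function_stats fs ON fs.object_id = o.object_id
WHERE o.name = 't_func';
GO

-- Executions added by each run = that run's count minus the previous run's count.
SELECT cur.run_num
    , cur.object_name
    , executions_in_run = cur.execution_count - prev.execution_count
FROM dbo.function_stats cur
    INNER JOIN dbo.function_stats prev
        ON prev.object_name = cur.object_name
        AND prev.run_num = cur.run_num - 1
ORDER BY cur.run_num;
```

If the function really is invoked once per row, the difference reported for each run should match the number of rows the TVF returned in that run.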

Sql-server – CROSS APPLY on Scalar function

I read that this is not good practice because the function is called a 'zillion' times, which has a bad impact on performance.

While CROSS APPLY can be useful in some cases, I don't expect any difference in performance between calling the function in WHERE and calling it in CROSS APPLY in this specific case. If the table has a million rows (and columns C and D possibly a million different values), the function will be called a million times. How could it be otherwise?

I tried to rewrite it with CROSS APPLY.

Here's how:

SELECT
    t.A,
    t.B,
    ca.Fc,
    ca.Fd,
    dbo.Func(t.E) AS Fe,
    t.F
FROM abcdef AS t
  CROSS APPLY 
    ( SELECT
          dbo.Func(t.C) AS Fc, 
          dbo.Func(t.D) AS Fd
    ) AS ca
WHERE 0 = ca.Fc + ca.Fd ;

or:

SELECT
...
FROM abcdef AS t
  CROSS APPLY 
    ( SELECT
          dbo.Func(t.C) AS Fc, 
          dbo.Func(t.D) AS Fd
      FROM (SELECT NULL AS x) AS dummy
      WHERE 0 = dbo.Func(t.C) + dbo.Func(t.D) 
    ) AS ca ;

Again, I don't think this will have any effect on efficiency.
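
One way to check this rather than take it on trust is to time both forms side by side. The sketch below is built on assumptions: the original query is not shown, so the first SELECT is a guess at its shape (the function called directly in WHERE), and the table definition, the body of dbo.Func, and the sample data are all hypothetical stand-ins.

```sql
-- Hypothetical stand-ins for the table and function named in the question.
CREATE TABLE dbo.abcdef (A int, B int, C int, D int, E int, F int);
GO
CREATE FUNCTION dbo.Func (@x int)
RETURNS int
AS
BEGIN
    RETURN @x % 2;   -- placeholder logic; the real function body is unknown
END
GO

-- Arbitrary sample rows taken from a system catalog view.
INSERT INTO dbo.abcdef (A, B, C, D, E, F)
SELECT TOP (10000)
    ac.column_id, ac.object_id, ac.column_id,
    ac.object_id, ac.column_id, ac.object_id
FROM sys.all_columns AS ac;
GO

SET STATISTICS TIME ON;

-- Assumed original form: scalar function called directly in SELECT and WHERE.
SELECT t.A, t.B, dbo.Func(t.C) AS Fc, dbo.Func(t.D) AS Fd
FROM dbo.abcdef AS t
WHERE 0 = dbo.Func(t.C) + dbo.Func(t.D);

-- CROSS APPLY form from the rewrite above.
SELECT t.A, t.B, ca.Fc, ca.Fd
FROM dbo.abcdef AS t
  CROSS APPLY
    ( SELECT
          dbo.Func(t.C) AS Fc,
          dbo.Func(t.D) AS Fd
    ) AS ca
WHERE 0 = ca.Fc + ca.Fd;

SET STATISTICS TIME OFF;
```

Comparing the CPU and elapsed times reported for the two statements, ideally over several runs, gives a direct measurement for the specific function and data involved.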
