Sql-server – way to prevent Scalar UDFs in computed columns from inhibiting parallelism

computed-columnfunctionsparallelismsql server

Much has been written about the perils of Scalar UDFs in SQL Server. A casual search will return oodles of results.

There are some places where a Scalar UDF is the only option, though.

As an example: when dealing with XML: XQuery can't be used as a computed column definition. One option documented by Microsoft is to use a Scalar UDF to encapsulate your XQuery in a Scalar UDF, and then use it in a computed column.

This has various effects, and some workarounds.

Executes row by row when the table is queried
Forces all queries against the table to run serially

You can get around the row-by-row execution by schemabinding the function, and either persisting the computed column, or indexing it. Neither of those methods can prevent the forced serialization of queries hitting the table, even when the scalar UDF isn't referenced.

Is there a known way to do that?

Best Answer

Yes if you:

are running SQL Server 2014 or later; and
are able to run the query with trace flag 176 active; and
the computed column is PERSISTED

Specifically, at least the following versions are required:

Cumulative Update 2 for SQL Server 2016 SP1
Cumulative Update 4 for SQL Server 2016 RTM
Cumulative Update 6 for SQL Server 2014 SP2

BUT to avoid a bug (ref for 2014, and for 2016 and 2017) introduced in those fixes, instead apply:

The trace flag is effective as a start-up –T option, at both global and session scope using DBCC TRACEON, and per query with OPTION (QUERYTRACEON) or a plan guide.

Trace flag 176 prevents persisted computed column expansion.

The initial metadata load performed when compiling a query brings in all columns, not just those directly referenced. This makes all computed column definitions available for matching, which is generally a good thing.

As an unfortunate side-effect, if one of the loaded (computed) columns uses a scalar user-defined function, its presence disables parallelism for the whole query, even when the computed column is not actually used.

Trace flag 176 helps with this, if the column is persisted, by not loading the definition (since expansion is skipped). This way, a scalar user-defined function is never present in the compiling query tree, so parallelism is not disabled.

The main drawback of trace flag 176 (aside from being only lightly documented) is that it also prevents query expression matching to persisted computed columns: If the query contains an expression matching a persisted computed column, trace flag 176 will prevent the expression being replaced by a reference to the computed column.

For more details, see my SQLPerformance.com article, Properly Persisted Computed Columns.

Since the question mentions XML, as an alternative to promoting values using a computed column and scalar function, you could also look at using a Selective XML Index, as you wrote about in Selective XML Indexes: Not Bad At All.

Explanation

The real question is why the optimizer felt the need to retrieve A, B, and C for the index seek at all. We would expect it to read the Comp column using a nonclustered index scan, and then perform a seek on the same index (alias T2) to locate the Top 1 record.

The query optimizer expands computed column references before optimization begins, to give it a chance to assess the costs of various query plans. For some queries, expanding the definition of a computed column allows the optimizer to find more efficient plans.

When the optimizer encounters a correlated subquery, it attempts to 'unroll it' to a form it finds easier to reason about. If it cannot find a more effective simplification, it resorts to rewriting the correlated subquery as an apply (a correlated join):

Apply rewrite

It just so happens that this apply unrolling puts the logical query tree into a form that does not work well with project normalization (a later stage that looks to match general expressions to computed columns, among other things).

In your case, the way the query is written interacts with internal details of the optimizer such that the expanded expression definition is not matched back to the computed column, and you end up with a seek that references columns A, B, and C instead of the computed column, Comp. This is the root cause.

Workaround

One idea to workaround this side-effect is to write the query as an apply manually:

SELECT
    T1.ID,
    T1.Comp,
    T1.D,
    CA.D2
FROM dbo.T AS T1
CROSS APPLY
(  
    SELECT TOP (1)
        D2 = T2.D
    FROM dbo.T AS T2
    WHERE
        T2.Comp = T1.Comp
        AND T2.D > T1.D
    ORDER BY
        T2.D ASC
) AS CA
WHERE
    T1.D IS NOT NULL -- DON'T CARE ABOUT INACTIVE RECORDS
ORDER BY
    T1.Comp;

Unfortunately, this query will not use the filtered index as we would hope either. The inequality test on column D inside the apply rejects NULLs, so the apparently redundant predicate WHERE T1.D IS NOT NULL is optimized away.

Without that explicit predicate, the filtered index matching logic decides it cannot use the filtered index. There are a number of ways to work around this second side-effect, but the easiest is probably to change the cross apply to an outer apply (mirroring the logic of the rewrite the optimizer performed earlier on the correlated subquery):

SELECT
    T1.ID,
    T1.Comp,
    T1.D,
    CA.D2
FROM dbo.T AS T1
OUTER APPLY
(  
    SELECT TOP (1)
        D2 = T2.D
    FROM dbo.T AS T2
    WHERE
        T2.Comp = T1.Comp
        AND T2.D > T1.D
    ORDER BY
        T2.D ASC
) AS CA
WHERE
    T1.D IS NOT NULL -- DON'T CARE ABOUT INACTIVE RECORDS
ORDER BY
    T1.Comp;

Now the optimizer does not need to use the apply rewrite itself (so the computed column matching works as expected) and the predicate is not optimized away either, so the filtered index can be used for both data access operations, and the seek uses the Comp column on both sides:

Outer Apply Plan

This would generally be preferred over adding A, B, and C as INCLUDEd columns in the filtered index, because it addresses the root cause of the problem, and does not require widening the index unnecessarily.

Persisted computed columns

As a side note, it is not necessary to mark the computed column as PERSISTED, if you don't mind repeating its definition in a CHECK constraint:

CREATE TABLE dbo.T 
(   
    ID integer IDENTITY(1, 1) NOT NULL,
    A varchar(20) NOT NULL,
    B varchar(20) NOT NULL,
    C varchar(20) NOT NULL,
    D date NULL,
    E varchar(20) NULL,
    Comp AS A + '-' + B + '-' + C,

    CONSTRAINT CK_T_Comp_NotNull
        CHECK (A + '-' + B + '-' + C IS NOT NULL),

    CONSTRAINT PK_T_ID 
        PRIMARY KEY (ID)
);

CREATE NONCLUSTERED INDEX IX_T_Comp_D
ON dbo.T (Comp, D) 
WHERE D IS NOT NULL;

The computed column is only required to be PERSISTED in this case if you want to use a NOT NULL constraint or to reference the Comp column directly (instead of repeating its definition) in a CHECK constraint.

Best Answer

Related Solutions

Sql-server – Applying user-defined fields to arbitrary entities

Sql-server – Index on Persisted Computed column needs key lookup to get columns in the computed expression

Explanation

Workaround

Persisted computed columns

Related Question