Sql-server – Multi-statement TVF vs Inline TVF Performance

set-returning-functionssql serversql-server-2012

Comparing some of the answers on the Palindrome question(10k+ users only, since I've deleted the answer), I'm getting confusing results.

I proposed a multi-statement, schema-bound TVF which I thought would be faster than running a standard function, which it is. I was also under the impression that the multi-statement TVF would be "inlined", although I am wrong on that count, as you'll see below. This question is about the performance difference of those two styles of TVF. First, you'll need to see the code.

Here is the multi-statement TVF:

IF OBJECT_ID('dbo.IsPalindrome') IS NOT NULL
DROP FUNCTION dbo.IsPalindrome;
GO
CREATE FUNCTION dbo.IsPalindrome
(
    @Word NVARCHAR(500)
) 
RETURNS @t TABLE
(
    IsPalindrome BIT NOT NULL
)
WITH SCHEMABINDING
AS
BEGIN
    DECLARE @IsPalindrome BIT;
    DECLARE @LeftChunk NVARCHAR(250);
    DECLARE @RightChunk NVARCHAR(250);
    DECLARE @StrLen INT;
    DECLARE @Pos INT;
    SET @RightChunk = '';
    SET @IsPalindrome = 0;
    SET @StrLen = LEN(@Word) / 2;
    IF @StrLen % 2 = 1 SET @StrLen = @StrLen - 1;
    SET @Pos = LEN(@Word);
    SET @LeftChunk = LEFT(@Word, @StrLen);
    WHILE @Pos > (LEN(@Word) - @StrLen)
    BEGIN
        SET @RightChunk = @RightChunk + SUBSTRING(@Word, @Pos, 1)
        SET @Pos = @Pos - 1;
    END
    IF @LeftChunk = @RightChunk SET @IsPalindrome = 1;
    INSERT INTO @t VALUES (@IsPalindrome);
    RETURN
END
GO

The inline-TVF:

IF OBJECT_ID('dbo.InlineIsPalindrome') IS NOT NULL
DROP FUNCTION dbo.InlineIsPalindrome;
GO
CREATE FUNCTION dbo.InlineIsPalindrome
(
    @Word NVARCHAR(500)
)
RETURNS TABLE
WITH SCHEMABINDING
AS RETURN (
    WITH Nums AS
    (
      SELECT
        N = number
      FROM
        dbo.Numbers
    )
    SELECT
      IsPalindrome =
        CASE
          WHEN EXISTS
          (
            SELECT N
            FROM Nums
            WHERE N <= L / 2
              AND SUBSTRING(S, N, 1) <> SUBSTRING(S, 1 + L - N, 1)
          )
          THEN 0
          ELSE 1
        END
    FROM
      (SELECT LTRIM(RTRIM(@Word)), LEN(@Word)) AS v (S, L)
);
GO

The Numbers table in the above function is defined as:

CREATE TABLE dbo.Numbers
(
    Number INT NOT NULL 
);

Note: The numbers table does not have any indexes and no primary key, and contains 1,000,000 rows.

A test-bed temporary-table:

IF OBJECT_ID('tempdb.dbo.#Words') IS NOT NULL
DROP TABLE #Words;
GO
CREATE TABLE #Words 
(
    Word VARCHAR(500) NOT NULL
);

INSERT INTO #Words(Word) 
SELECT o.name + REVERSE(w.name)
FROM sys.objects o
CROSS APPLY (
    SELECT o.name
    FROM sys.objects o
) w;

On my test system the above INSERT results in 16,900 rows being inserted into the #Words table.

To test the two variations, I SET STATISTICS IO, TIME ON; and use the following:

SELECT w.Word
    , p.IsPalindrome
FROM #Words w
    CROSS APPLY dbo.IsPalindrome(w.Word) p
ORDER BY w.Word;


SELECT w.Word
    , p.IsPalindrome
FROM #Words w
    CROSS APPLY dbo.InlineIsPalindrome(w.Word) p
ORDER BY w.Word;

I expected the InlineIsPalindrome version to be significantly faster, however the following results do not support that supposition.

Multi-statement TVF:

Table '#A1CE04C3'. Scan count 16896, logical reads 16900, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#Words'. Scan count 1, logical reads 88, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 1700 ms, elapsed time = 2022 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.

Inline TVF:

Table 'Numbers'. Scan count 1, logical reads 1272030, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#Words'. Scan count 1, logical reads 88, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 137874 ms, elapsed time = 139415 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.

The execution plans look like:

enter image description here

enter image description here

Why is the inline variant so much slower than the multi-statement variant, in this case?

In response to a comment by @AaronBertrand, I've modified the dbo.InlineIsPalindrome function to limit the rows returned by the CTE to match the length of the input word:

CREATE FUNCTION dbo.InlineIsPalindrome
(
    @Word NVARCHAR(500)
)
RETURNS TABLE
WITH SCHEMABINDING
AS RETURN (
    WITH Nums AS
    (
      SELECT
        N = number
      FROM
        dbo.Numbers
      WHERE 
        number <= LEN(@Word)
    )
    SELECT
      IsPalindrome =
        CASE
          WHEN EXISTS
          (
            SELECT N
            FROM Nums
            WHERE N <= L / 2
              AND SUBSTRING(S, N, 1) <> SUBSTRING(S, 1 + L - N, 1)
          )
          THEN 0
          ELSE 1
        END
    FROM
      (SELECT LTRIM(RTRIM(@Word)), LEN(@Word)) AS v (S, L)
);

As @MartinSmith suggested, I've added a primary key and clustered index to the dbo.Numbers table, which certainly helps and would be closer to what one would expect to see in a production environment.

Re-running the tests above now results in the following stats:

CROSS APPLY dbo.IsPalindrome(w.Word) p:

(17424 row(s) affected)
Table '#B1104853'. Scan count 17420, logical reads 17424, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#Words'. Scan count 1, logical reads 90, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 1763 ms, elapsed time = 2192 ms.

dbo.FunctionIsPalindrome(w.Word):

(17424 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#Words'. Scan count 1, logical reads 90, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 328 ms, elapsed time = 424 ms.

CROSS APPLY dbo.InlineIsPalindrome(w.Word) p:

(17424 row(s) affected)
Table 'Numbers'. Scan count 1, logical reads 237100, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#Words'. Scan count 1, logical reads 90, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 17737 ms, elapsed time = 17946 ms.

I'm testing this on SQL Server 2012 SP3, v11.0.6020, Developer Edition.

Here is the definition of my numbers table, with the primary key and clustered index:

CREATE TABLE dbo.Numbers
(
    Number INT NOT NULL 
        CONSTRAINT PK_Numbers
        PRIMARY KEY CLUSTERED
);

;WITH n AS
(
    SELECT v.n 
    FROM (
        VALUES (1) 
            ,(2) 
            ,(3) 
            ,(4) 
            ,(5) 
            ,(6) 
            ,(7) 
            ,(8) 
            ,(9) 
            ,(10)
        ) v(n)
)
INSERT INTO dbo.Numbers(Number)
SELECT ROW_NUMBER() OVER (ORDER BY n1.n)
FROM n n1
    , n n2
    , n n3
    , n n4
    , n n5
    , n n6;

Best Answer

Your numbers table is a heap and is potentially being fully scanned each time.

Add a clustered primary key on Number and try the following with a forceseek hint to get the desired seek.

As far as I can tell this hint is needed as SQL Server just estimates that 27% of the table will match the predicate (30% for the <= and reduced down to 27% by the <>). And therefore that it will only have to read 3-4 rows before finding one that matches and it can exit the semi join. So the scan option is costed very cheaply. But in fact if any palindromes do exist then it will have to read the whole table so this is not a good plan.

CREATE FUNCTION dbo.InlineIsPalindrome
(
    @Word NVARCHAR(500)
)
RETURNS TABLE
WITH SCHEMABINDING
AS RETURN (
    WITH Nums AS
    (
      SELECT
        N = number
      FROM
        dbo.Numbers WITH(FORCESEEK)
    )
    SELECT
      IsPalindrome =
        CASE
          WHEN EXISTS
          (
            SELECT N
            FROM Nums
            WHERE N <= L / 2
              AND SUBSTRING(S, N, 1) <> SUBSTRING(S, 1 + L - N, 1)
          )
          THEN 0
          ELSE 1
        END
    FROM
      (SELECT LTRIM(RTRIM(@Word)), LEN(@Word)) AS v (S, L)
);
GO

With those changes in place it flies for me (takes 228ms)

enter image description here