Sql-server – Is any caching mechamism used for non-persisted computed columns

cachecomputed-columnsql server

Does SQL Server cache the result of a non-persisted computed column so it can be reused without incurring cost of recalculation?

Additional Context
I've always given the advice that if using computed columns you should use the PERSISTED option unless you're expecting a higher insert/update frequency than read frequency, or if you need better performance around inserts/updates than you do reads of the data (i.e. since the cost of calculation has to be incurred on one or the other, so the decision is really where you want to pay that cost).

There's also the consideration of the additional storage for the computed data, but typically that's pretty negligible and cheap, so not much of a consideration.

However, I wanted to check if my advice is entirely accurate, since SQL may be more intelligent… i.e. Once SQL's calculated a computed column, it could record this value in memory so that it doesn't have to recalculate the value on subsequent queries. SQL could have a timestamp on the cached value and another on the underlying record data to say if that record's been changed since the computed value was calculated to determine whether the cached value is still valid.

Is there anything like this in place / does it depend on available resources (e.g. memory), or other factors (e.g. does the cached value have a TTL beyond that of the process lifetime)? I've never read anything implying that this exists, but I'd be surprised if there weren't some optimization going on under the covers.

Best Answer

The non-persisted computed column values are not cached in memory.

Here's a brief-ish demo that confirms this. First I'll create a database for our testing:

CREATE DATABASE CC_BufferPool_Test;
GO
USE [CC_BufferPool_Test];
GO

And then create a table without a computed column as a baseline. The column sizes in this table are specifically chosen so that 4 rows will exactly fill one 8KB data page (and thus one 1 page in SQL Server's buffer cache):

CREATE TABLE T
(
    C1 INT NOT NULL IDENTITY PRIMARY KEY,
    Filler CHAR(2011) NOT NULL
);
GO

INSERT INTO T (Filler) VALUES(REPLICATE('A', 2011));
GO 4

Next I'll create the same table, but with the addition of a computed column:

CREATE TABLE T_WithCC
(
    C1 INT NOT NULL IDENTITY PRIMARY KEY,
    C2 AS 'Row #' + CONVERT(CHAR(8), C1, 1),
    Filler CHAR(2011) NOT NULL
);
GO

INSERT INTO T_WithCC(Filler) VALUES(REPLICATE('A', 2011));
GO 4

Next, I'll make sure these 8 rows are written to disk, and then clear the buffer cache, and then select all 8 rows so that they get pulled into memory:

CHECKPOINT
DBCC DROPCLEANBUFFERS;

SELECT * FROM T;
SELECT * FROM dbo.T_WithCC;

Finally, I'll use this query that I totally did not steal from Aaron Bertrand*

pages in memory per object

From that, you can see there is only one page in memory for each of those two tables. If the computed column were stored in memory, it would have pushed the second table's rows onto a second page.

You can also see the calculation occurring each time in the actual execution plan:

compute scalar in actual execution plan

If you do need to "cache" non-persisted computed columns for specific queries, you can actually create a nonclustered index on them. That way they are stored in the index, but not on the base table.

*see Determine SQL Server memory use by database and object, and here's the code:

;WITH src AS
(
    SELECT
        [Object] = o.name,
        [Type] = o.type_desc,
        [Index] = COALESCE(i.name, ''),
        [Index_Type] = i.type_desc,
        p.[object_id],
        p.index_id,
        au.allocation_unit_id
    FROM sys.partitions AS p
        INNER JOIN sys.allocation_units AS au
            ON p.hobt_id = au.container_id
        INNER JOIN sys.objects AS o
            ON p.[object_id] = o.[object_id]
        INNER JOIN sys.indexes AS i
            ON o.[object_id] = i.[object_id]
            AND p.index_id = i.index_id
    WHERE
        au.[type] IN (1,2,3)
        AND o.is_ms_shipped = 0
)
SELECT
    src.[Object],
    src.[Type],
    src.[Index],
    src.Index_Type,
    buffer_pages = COUNT_BIG(b.page_id),
    buffer_mb = COUNT_BIG(b.page_id) / 128
FROM src
    INNER JOIN sys.dm_os_buffer_descriptors AS b
        ON src.allocation_unit_id = b.allocation_unit_id
WHERE b.database_id = DB_ID()
GROUP BY
    src.[Object],
    src.[Type],
    src.[Index],
    src.Index_Type
ORDER BY buffer_pages DESC;