Sql-server – Temporal tables: history table population

sql serversql-server-2016temporal-tables

According to the prominent diagram here, the main table in a temporal table set-up is populated during the insert, and the history table is populated during update or delete.

However when I look at a query plan inserting or merging a temporal table tblWhatever, considerable cost is associated with the operator Table Insert [tblWhateverHistory] and when I re-create the clustered index on the history table, dropped before populating the temporal table, the allocated storage jumps considerably.

It all leads me to believe, that the history table is populated with a copy of every record in the temporal table initially, but those history copies are not accessible until and unless the temporal records are touched by updates or deletes. Can anyone confirm or prove me wrong?

Best Answer

I wanted to see if I could reproduce your observation of activity on the history table during inserts to the temporal table. To isolate the two tables I will place each on its own filegroup. I started with a new database in its own directory:

use master;
go

alter database DB219238 set single_user with rollback immediate;
drop database if exists DB219238;
go


CREATE DATABASE DB219238
ON PRIMARY 
    -- Primary filegroup gets the default values
    ( NAME = N'DB219238', 
      FILENAME = N'D:\Database\Data\DB219238\DB219238.mdf' , 
      SIZE = 3MB , 
      MAXSIZE = UNLIMITED, 
      FILEGROWTH = 65536KB ), 
FILEGROUP [DATA]  DEFAULT
    -- This will hold the temporal table
    ( NAME = N'Data', 
      FILENAME = N'D:\Database\Data\DB219238\Data.ndf' , 
      SIZE = 1MB ,            -- deliberately small so I can observer growth events
      MAXSIZE = UNLIMITED, 
      FILEGROWTH = 1MB ),     -- deliberately small
FILEGROUP [History]
    -- This will hold the history table
    ( NAME = N'History', 
      FILENAME = N'D:\Database\Data\DB219238\History.ndf' , 
      SIZE = 1MB ,            -- deliberately small, as above
      MAXSIZE = UNLIMITED, 
      FILEGROWTH = 1MB )      -- deliberately small
LOG ON 
    -- Just the defaults
    ( NAME = N'DB219238_log', 
      FILENAME = N'D:\Database\Log\DB219238_log.ldf' , 
      SIZE = 8192KB , 
      MAXSIZE = 2048GB , 
      FILEGROWTH = 65536KB );
GO

-- I don't want to be concerned with log growth    
ALTER DATABASE DB219238 SET RECOVERY SIMPLE;
GO

Then I created a temporal table and its history table.

use DB219238;
go

create table dbo.Example_History
(
    ExampleId           int             not NULL,
    Information         char(4100)      not NULL,
    SysStartTimeUTC     datetime2(7)    not NULL,
    SysEndTimeUTC       datetime2(7)    not NULL,
) on History;    -- note the filegroup
GO
checkpoint;
go


create table dbo.Example
(
    ExampleId                           int IDENTITY(1,1)   not NULL,
    Information                         char(4100)          not NULL,

    SysStartTimeUTC                     datetime2(7) GENERATED ALWAYS AS ROW START  not NULL,
    SysEndTimeUTC                       datetime2(7) GENERATED ALWAYS AS ROW END    not NULL,

    constraint PK_Example primary key clustered
    (
        ExampleId ASC
    ),

    period for system_time (SysStartTimeUTC, SysEndTimeUTC)
) on Data    -- note the filegroup
with
(
    SYSTEM_VERSIONING = ON ( HISTORY_TABLE = dbo.Example_History )
);
GO
checkpoint;
go

The Information column is just over half a page long and fixed length so one row consumes a page and relatively few rows can cause a lot of page allocation. At this point the File Explorer shows the directory thus:

Now I populate the temporal table:

insert dbo.Example
(
    Information
)
select
    '';
GO 10000    -- arbitrary, but proved sufficient to demonstrate the case
checkpoint;
go

The only file growth has been on the temporal table. Or at least write activity on the History filegroup is less than the 1MB initially allocated to it.

As a further test I ran SysInternals Process Monitor (PM) while performing the inserts. The checkpoint statements throughout the code are to ensure activity is flushed to disk so PM can observe it. I captured all events on the directory holding the DB's files. It registered 145 WriteFile operations, amongst others, against Data.ndf and no operations at all against History.ndf.

The execution plan for the INSERT is this:

It shows no activity on the history table, only the temporal table.

Building a clustered index on the history table again showed no activity on History.ndf in Process Monitor.

create clustered index IX_History on dbo.Example_History (ExampleId);
checkpoint;

I think this disproves your hypothesis, at least for my minimal setup. I'm not surprised to see file growth on a clustered index build. In SQL Server the clustered index leaf pages are the data. So to get the data in clustered sequence the storage engine copies the data from its current location to its new. Likely there would be a lot of free space left in that file after the clustered index had build, being the space where the data used to be.

I cannot guess why you saw the insert operators referencing the history table.

Related Solutions

Sql-server – Do temporal tables log changes when there are none

Converting my comment to answer ..

If the value is actually updated, then it will log changes. If you update the same value, new versions for every row will be recorded, but the column values will remain the same.

Below is the working example :

CREATE TABLE dbo.blah1
    -- PK is a requirement !!
    (
    foo INT identity(1, 1) PRIMARY KEY
    ,baa INT
    ,SysStartTime DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL
    ,SysEndTime DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL
    ,PERIOD FOR SYSTEM_TIME(SysStartTime, SysEndTime)
    )
    WITH (SYSTEM_VERSIONING = ON);

-- insert data
insert into dbo.blah (baa)
values (1)

Now check for the data

-- check data
    SELECT * FROM  dbo.blah;
    select * from [dbo].[MSSQL_TemporalHistoryFor_629577281]

perform dummy update

update blah 
set baa = baa

Perform onemore dummy update .. there will be rows with same values but different systime - start and end.

You have to have a proper mechanism to clean up history table as that can grow significantly larger if you forget to trim it.

Sql-server – Using SQL Server’s 2016 Temporal Tables to Record Prior History

Currently temporal supports system time, which means the date/time values placed in those start/end time columns are populated by the system.

In a future version, temporal will also support application time, which will provide ways to have the application dictate the time and, I expect, to update existing data with real start/end times.

The only way to override this really would be to archive data from the history table into your own table (which you may end up doing over time to keep the history table from growing forever anyway), change the times there, and use that table as part of your queries.

Best Answer

Related Solutions

Sql-server – Do temporal tables log changes when there are none

Sql-server – Using SQL Server’s 2016 Temporal Tables to Record Prior History

Related Question