SQL Server Temporal Tables – Improving Performance on Older Values

Tags: performance, sql-server, temporal-tables

I'm encountering a strange issue when accessing historical records in a temporal table: queries that access the older entries via the AS OF sub-clause take longer than queries on recent historical entries.

The history table was generated by SQL Server (it has a clustered index on the period date columns and uses page compression). I added 50 million rows to it, and my queries retrieve about 25,000 rows.
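
For context, the table pair was set up along these lines (a simplified sketch; BenefitID is an illustrative column name, while the table and period column names match those used in the answer below):

CREATE TABLE dbo.Benefits
(
    BenefitID    INT IDENTITY NOT NULL PRIMARY KEY,  -- illustrative key column
    -- ... other business columns ...
    SysStartTime DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    SysEndTime   DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (SysStartTime, SysEndTime)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.BenefitsHistory));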

I have tried to determine the root cause of the issue but have not been able to identify it. So far I have tested:

  • Creating a test table with 50 million rows and a clustered index, to see if the slowdown was simply due to volume. I was able to retrieve 25K rows in roughly constant time (~400 ms).
  • Removing page compression from the history table. That had no effect on the retrieval time but did significantly increase the size of the table.
  • Accessing the rows of the history table directly using an ID column versus the date columns. This is where things got more interesting: I could access older rows in the table in ~400 ms, whereas with the AS OF sub-clause it took ~1200 ms (both query shapes are sketched below). Filtering my test table on the date column showed a similar slowdown compared to filtering on the ID column, which leads me to believe the date comparisons are behind some of the slowdown.
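
For reference, the two access patterns from the last bullet look roughly like this (BenefitID, the ID range, and the date literal are all illustrative):

-- Temporal query via the AS OF sub-clause (~1200 ms for older points in time)
SELECT *
FROM dbo.Benefits FOR SYSTEM_TIME AS OF '2015-06-01'
WHERE BenefitID BETWEEN 100000 AND 125000;

-- Direct read of the history table on the ID column (~400 ms)
SELECT *
FROM dbo.BenefitsHistory
WHERE BenefitID BETWEEN 100000 AND 125000;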

I want to look into this more, but I also want to make sure that I am not barking up the wrong tree. First, has anyone else experienced this behavior when accessing older historical data in a temporal table (we only noticed slowdowns past 10 million rows)? Second, what strategies can I use to further isolate the root cause of the performance issue (I just started looking into execution plans, but they are still a bit cryptic to me)?

Execution plans

These are simple retrieval queries: the first accesses older rows, the second accesses newer rows.

Older Rows ~1200ms execution time

Recent Rows ~350ms execution time

Table details

These are the columns in the temporal table. The history table has the same columns but does not have a primary key (as per the history table requirements):
Temporal table columns

Below are the indices on the history table:
Indices on the history table

Best Answer

In a comment on your question, Zane stated:

...It seems like part of your problem is you're reading 50 million rows in order to return 20K in the plan.

This is, indeed, the problem. There's no index available to push some, or all, of the predicates down to the storage engine. Microsoft recommends this baseline indexing strategy for temporal tables in the Docs article Temporal Table Considerations and Limitations:

An optimal indexing strategy will include a clustered columnstore index and/or a B-tree rowstore index on the current table and a clustered columnstore index on the history table for optimal storage size and performance. If you create/use your own history table, we strongly recommend that you create this type of index consisting of period columns starting with the end of period column to speed up temporal querying as well as speeding up the queries that are part of the data consistency check. The default history table has a clustered rowstore index created for you based on the period columns (end, start). At a minimum, a non-clustered rowstore index is recommended.

The phrasing of that is a little confusing (to me, anyway). But the takeaway is that you could create these indexes to improve performance somewhat, if not substantially:

NC index on the current table, leading with SysEndTime:

CREATE NONCLUSTERED INDEX IX_SysEndTime_SysStartTime 
ON dbo.Benefits (SysEndTime, SysStartTime)
/*INCLUDE (ideally, include your other important fields here)*/;

This will allow you to avoid reading some of the rows in the current table by seeking to the appropriate end time.
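
To see why leading with the end-of-period column helps, note that FOR SYSTEM_TIME AS OF is expanded internally into a UNION ALL over the current and history tables, filtering on both period columns, roughly like this (a sketch; @AsOfDate and its value are illustrative):

DECLARE @AsOfDate DATETIME2 = '2015-06-01';

-- What the engine effectively runs for FOR SYSTEM_TIME AS OF @AsOfDate
SELECT *
FROM dbo.Benefits
WHERE SysStartTime <= @AsOfDate AND SysEndTime > @AsOfDate

UNION ALL

SELECT *
FROM dbo.BenefitsHistory
WHERE SysStartTime <= @AsOfDate AND SysEndTime > @AsOfDate;

An index leading with SysEndTime turns the end-of-period predicate into a seek rather than a scan.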

CCI on the history table

CREATE CLUSTERED COLUMNSTORE INDEX ix_BenefitsHistory
ON dbo.BenefitsHistory
WITH (DROP_EXISTING = ON);

This will let you get batch mode on the history table, which should make the scans much faster.

NC index on the current table, leading with SysStartTime:

See Paul's answer to the question Most Efficient Way to Retrieve Date Ranges for more details on why indexing for date range queries is hard. Based on the logic there, it makes sense to add another NC index on the current table that leads with SysStartTime, so that the optimizer can choose which one to use based on statistics and the specific parameters of your query:

CREATE NONCLUSTERED INDEX IX_SysStartTime_SysEndTime
ON dbo.Benefits (SysStartTime, SysEndTime)
/*INCLUDE (ideally, include your other important fields here)*/;

Creating the 3 indexes outlined above made a significant difference in resource usage in my test cases. I set up a test case which runs two queries that return 1.5 million rows in total (both the history and current tables have 50 million rows).

Note: To reduce SSMS overhead, I ran the test with "Discard results after execution" option enabled.
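
Figures like the logical reads and CPU/elapsed times below can be captured with the session statistics options, e.g. (a sketch):

SET STATISTICS IO ON;    -- per-table logical reads
SET STATISTICS TIME ON;  -- CPU time and elapsed time

-- ... run the two test queries here ...

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;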

Execution Plan - Default Indexes

Logical reads: 1,330,612
CPU time: 00:00:14.718
Elapsed time: 00:00:06.198

Execution Plan - With Indexes Described Above

Logical reads: 27,656 (8,111 row store + 19,545 columnstore)
CPU time: 00:00:01.828
Elapsed time: 00:00:01.150

As you can see, all 3 measures dropped significantly, including total elapsed time, which fell from 6 seconds to 1 second.


The other option presented by the Docs article is to forgo the two NC indexes on the current table in favor of a clustered columnstore index. In my test, performance was very similar to the indexing solution described above.
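
For completeness, that alternative would look something like the sketch below. It assumes the current table's clustered primary key is first rebuilt as nonclustered (a table can have only one clustered index), and you may need to toggle SYSTEM_VERSIONING off while swapping the indexes; PK_Benefits and BenefitID are illustrative names:

ALTER TABLE dbo.Benefits DROP CONSTRAINT PK_Benefits;
ALTER TABLE dbo.Benefits ADD CONSTRAINT PK_Benefits
    PRIMARY KEY NONCLUSTERED (BenefitID);

CREATE CLUSTERED COLUMNSTORE INDEX cci_Benefits
ON dbo.Benefits;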