I have a SQL Server 2016 temporal table query that I’m having a hard time figuring out an efficient way to write.
Let’s say I have a temporal table called [Users] that each day has the total body of users in a group. Each day users can have a status of ('A', 'B', 'C', 'Y', 'Z')
If you wanted to select the current version of the table I would simply write
SELECT [UserID]
, [Status] --where the user status would be listed as 'A', 'B', 'C', 'Y', or 'Z'
, [InsertDate] --record creation date which isn't related to the System-versioning fields
, [SysStartTime] --<start_date_time> which could be omitted from the simple query
, [SysEndTime] --<end_date_time> which could be omitted from the simple query
FROM [Users]
If I wanted to look up how the table looked on 6am of October 2nd, 2017 I know could write:
SELECT [UserID]
, [Status]
, [InsertDate]
, [SysStartTime]
, [SysEndTime]
FROM [Users]
FOR SYSTEM_TIME AS OF '2017-10-02 06:00:00'
Easy enough and I could count the number of people with various statuses using a simple SUM function.
Where my problem begins is when I want to analyse trends over a longer period. In theory I want a picture of how the table looked on each day of that extended timespan. Like if I was looking at 20 contiguous days of data with the [Users] table having on average 5000 rows per a day I would hope that my query would return about 100000 records.
My initial thought was a to pass in a subquery to the 'FOR…AS OF' statement but SQL Server does not seem to like that. I have tried a bunch of things many of which look something like this:
SELECT [UserID]
, [Status]
, [InsertDate]
, [SysStartTime]
, [SysEndTime]
FROM [Users]
FOR SYSTEM_TIME AS OF (SELECT Fulldate FROM SetOfDates_Table)
Am I missing something obvious about how to pass a set of date values to 'AS OF'? Should I instead be using the 'SELECT..FROM..FOR..AS OF' as a function, pass a parameter to it then UNION the resulting data sets together?
Just for reference, I'm in a data warehousing environment where I'd want to schedule the query to happen in advance rather than needing it to run on demand.
Best Answer
While
SYSTEM_TIME AS OF
allows the use of literals and variables, it apparently does not allow you to use a correlated query to pass in different dates from something like a date dimension table.With that in mind, here is a solution that uses a multi-statement Table-Valued User-Defined Function (TVF). It takes in a StartDate and EndDate and uses a while loop to populate the return table. In general, multi-statement TVF's usually don't perform as well as inline TVF's, but since you said you could schedule this query and not have to run on demand, the performance may still be acceptable.
Here, we create a temporal table called Customer. We load 1 row and then have a series of update statements spaced 2 seconds apart to reflect some history
Now, we define the multi-statement TVF
Notice that the
WHILE
loop is incrementing the @DateWork variable by 1 second (easier to demo), but you can defined the granularity as a day.Also, remember that temporal Start and End dates are UTC, so you'll need to make adjustments for that.
I ran my example at 7:00am Eastern Time (United States) which converts to UTC time of 11:00am