Sql-server – Track child records versioning and query performance

database-designperformancequery-performancesql server 2014

We have a database with versioned data and we are trying to find out a faster to query data.

There is a Model table and a few child tables (eg: Template Table) having references to Model. When there is a change in a Child table, it creates a new Model version and creates a new record in the child table.

In the following example there are 3 changes for the First Model ( so 3 versions). In the Template Table there are 4 different templates belongs to the First model.

Model Table

+----+--------+-----------+
| ID | Name   | VersionID |
+----+--------+-----------+
| 1  | First  | 1         |
+----+--------+-----------+
| 2  | Second | 1         |
+----+--------+-----------+
| 3  | Third  | 1         |
+----+--------+-----------+
| 1  | First  | 2         |
+----+--------+-----------+
| 1  | First  | 3         |
+----+--------+-----------+

When there is template change for a model, it creates a new record in the Template table just for that Template.

Template Table

+----+---------+----------------+------+
| ID | ModelID | ModelVersionID | Data |
+----+---------+----------------+------+
| 1  | 1       | 1              | xxx  |
+----+---------+----------------+------+
| 2  | 1       | 1              | xxx  |
+----+---------+----------------+------+
| 3  | 1       | 1              | xxx  |
+----+---------+----------------+------+
| 4  | 1       | 1              | xxx  |
+----+---------+----------------+------+
| 5  | 2       | 1              | xxx  |
+----+---------+----------------+------+
| 6  | 3       | 1              | xxx  |
+----+---------+----------------+------+
| 1  | 1       | 2              | xxx  |
+----+---------+----------------+------+
| 2  | 1       | 2              | xxx  |
+----+---------+----------------+------+
| 6  | 1       | 2              | xxx  |
+----+---------+----------------+------+
| 1  | 1       | 3              | xxx  |
+----+---------+----------------+------+

When I want to load the latest Templates for model version 3. I have to write a query to get the latest version of Template for ModelVersionId <= 3 (I am not sure how to write that query properly in a single sql statement)

Following templates should be the result for Model version 3.

+----+---------+----------------+------+
| ID | ModelID | ModelVersionID | Data |
+----+---------+----------------+------+
| 3  | 1       | 1              | xxx  |
+----+---------+----------------+------+
| 4  | 1       | 1              | xxx  |
+----+---------+----------------+------+   
| 2  | 1       | 2              | xxx  |
+----+---------+----------------+------+
| 6  | 1       | 2              | xxx  |
+----+---------+----------------+------+
| 1  | 1       | 3              | xxx  |
+----+---------+----------------+------+

This approach will be slow when there are 100s of versions for every Model and thousands of records in the Template table. So there is another approach we are considering.

Adding FromDateTime and ToDateTime in child tables.

Model Table

+----+--------+-----------+-----------------------|
| ID | Name   | VersionID | CreatedDateTime       |
+----+--------+-----------+-----------------------+
| 1  | First  | 1         |    ......             |
+----+--------+-----------+-----------------------+
| .  | .....  | ....      |    ......             |
+----+--------+-----------+-----------------------+
| 1  | First  | 3         |2018-04-18 15:07:28.920|
+----+--------+-----------+-----------------------+

Template Table

+----+---------+----------------+------+--------------+------------+
| ID | ModelID | ModelVersionID | Data | FromDateTime | ToDateTime |
+----+---------+----------------+------+--------------+------------+
| 1  | 1       | 1              | xxx  |              |            |
+----+---------+----------------+------+--------------+------------+
| .  | .....   | .....          | ...  |   ......     |   ......   |
+----+---------+----------------+------+--------------+------------+

ToDateTime is NULL for the current(latest) record.

So, if I want to load the latest templates for the latest model version. It is easy, just load the Templates for ModelId 1 where ToDateTime is NULL.

When I want to load templates for an old Model version. The query will be

FromDateTime  >= CreatedDateTime AND ToDateTime   <= CreatedDateTime

If we index FromDateTime and ToDateTime, above query should be much fast than the previous case. Before going ahead with that direction, I am wondering if there are other common patterns to solve this type of versioning?

Best Answer

For your date ranges, always use the max collating value (typically 9999-12-31) for the end time to represent "current" or "until further notice". This means that your queries for finding a record that is current at a particular point in time (including GETDATE()) will be consistent.

See also my answer to a similar question on Stack Overflow, with additional details and tips.