SQL Server – Valid To/From Columns on a Fact Table in Data Warehousing

data-warehousedatabase-designsql serverstar-schema

I recently came across "valid from" and "valid to" columns on a fact table. Obviously this is common for dimensions but I've not seen them on fact tables before and can't find any info suggesting they're widely used.

The table granularity is analogous with bank account transactions, so a new fact is generated when a deposit is made, or an amount withdrawn. The reason given for inclusion of the columns is the need for consistent reporting (if we run this report today it must be the same as it was a week ago) combined with poor quality of source systems/data; users are able to go into the source system and say "this deposit was actually only £10, not £100". When this happens a second fact row is inserted and the original row is expired.

In my mind, a new fact row should be inserted to reverse the original in order to maintain the history (applying -£100 in the example) and the updated fact should be inserted (+£10). It feels like working with the valid to/from columns introduces too much complexity for users when reporting as well as the risk of error (summing up both active and expired facts).

Does anyone have experience of this and are there any references which specifically cover it (blog posts, articles or even books)?

Best Answer

The validity depends on how users want to look at the data. You are looking at it as just a transaction fact. Other types of fact tables include periodic snapshots and accumulating snapshots. If you want to see all the times that someone corrected a row to help decrease erroneous entries, the effective dates may be appropriate so it's clear that the transaction was updated. This creates a fact table that is somewhat similar to a type 2 SCD.

The Kimball Group has an article that directly addresses your question.

Here's a Kimball Group design tip that talks about effective dated accumulating snpashot fact tables.

You may be correct that you should just add transactions that reverse the original row. That sounds like it could be a valid solution if you just need to see the transactions and sum them up. This is how I've seen most accounting data work. Kimball says the effective dated fact tables may be useful for quickly calculating account balances at a paritcular point in time, especially for tracking slowly changing balances. But this is a fairly rare case. I think your concern that it will be confusing for users is also valid. You have to decide if that can be overcome by education and if its worth it for the added analytical capabilites in the data.

I haven't had to do this much in my data warehousing experience because most of my facts were simple transactions or periodic snapshots. But I have created several effective dated bridge tables for many-to-many relationships.