The validity depends on how users want to look at the data. You are looking at it as just a transaction fact. Other types of fact tables include periodic snapshots and accumulating snapshots. If you want to see all the times that someone corrected a row to help decrease erroneous entries, the effective dates may be appropriate so it's clear that the transaction was updated. This creates a fact table that is somewhat similar to a type 2 SCD.
The Kimball Group has an article that directly addresses your question.
Here's a Kimball Group design tip that talks about effective dated accumulating snpashot fact tables.
You may be correct that you should just add transactions that reverse the original row. That sounds like it could be a valid solution if you just need to see the transactions and sum them up. This is how I've seen most accounting data work. Kimball says the effective dated fact tables may be useful for quickly calculating account balances at a paritcular point in time, especially for tracking slowly changing balances. But this is a fairly rare case. I think your concern that it will be confusing for users is also valid. You have to decide if that can be overcome by education and if its worth it for the added analytical capabilites in the data.
I haven't had to do this much in my data warehousing experience because most of my facts were simple transactions or periodic snapshots. But I have created several effective dated bridge tables for many-to-many relationships.
I believe the guidance is referring to a wide fact table where the majority of measure values are null:
CREATE TABLE dbo.SparseFact
(
Dim1Key INT NOT NULL
, Dim2Key INT NOT NULL
, Dim3Key INT NOT NULL
, Dim4Key INT NOT NULL
, Dim5Key INT NOT NULL
, Value1 INT NULL
, Value2 INT NULL
, Value3 INT NULL
, Value4 INT NULL
, Value5 INT NULL
, Value6 INT NULL
, Value7 INT NULL
, Value8 INT NULL
..
, Value101 INT NULL
, Value102 INT NULL
, Value103 INT NULL
);
The suggestion is that some people will see all the nulls and decide to do this instead:
CREATE TABLE dbo.DontDoThisFact
(
Dim1Key INT NOT NULL
, Dim2Key INT NOT NULL
, Dim3Key INT NOT NULL
, Dim4Key INT NOT NULL
, Dim5Key INT NOT NULL
, MeasureTypeKey INT NOT NULL
, Value INT NOT NULL
);
Not good.
In your scenario I think I'd be looking at something like this, which is very different to the scenario described in the articles you referenced.
CREATE TABLE dbo.InventoryFact
(
ContainerKey INT NOT NULL
, TransportTypeKey TINYINT NOT NULL
, EntryDateTime DATETIME NULL
, ExitDateTime DATETIME NULL
);
CREATE TABLE dbo.TransportType
(
TransportTypeKey TINYINT IDENTITY(1,1) NOT NULL
, EntryTransport CHAR(10) NOT NULL
, ExitTransport CHAR(10) NOT NULL
);
INSERT
dbo.TransportType
SELECT
EntryTransport
, ExitTransport
FROM
(
SELECT EntryTransport = 'Train'
UNION
SELECT EntryTransport = 'Truck'
UNION
SELECT EntryTransport = 'Vessel'
UNION
SELECT EntryTransport = 'N/A'
UNION
SELECT EntryTransport = 'Unknown'
) en
CROSS JOIN
(
SELECT ExitTransport = 'Train'
UNION
SELECT ExitTransport = 'Truck'
UNION
SELECT ExitTransport = 'Vessel'
UNION
SELECT ExitTransport = 'N/A'
UNION
SELECT ExitTransport = 'Unknown'
) ex;
To the additional questions...
I would add ExpectedEntryDate
, ExpectedExitDate
to the Container/InventoryFact
. Less certain, without visibility of all the data elements, I would probably put EntryVoyageId
and ExitVoyageId
in a separate junk dimension together as one row along with any other degenerate data items (identifiers for the truck, train etc).
I would add 3 new dimensions for VesselVoyage
, TruckVoyage
and TrainVoyage
and 6 Voyage keys (inbound/outbound) to this one fact (it's 6 new keys, not 6 additional rows). You then have the option of placing Dock
and Tollbooth
in the appropriate Voyage dimension. If you keep the generic data in these dimensions (VesselFlag
, TruckCapacity
) and the specific in a junk dimension (VesselName
, VesselMMSI
) they won't explode in size.
Best Answer
I will typically have my business key in the fact table so I can easily track back to the source system for any questions. I usually will put a unique constraint on it to ensure the granularity of the fact table is the same as the business key.