I believe the guidance is referring to a wide fact table where the majority of measure values are null:
CREATE TABLE dbo.SparseFact
(
Dim1Key INT NOT NULL
, Dim2Key INT NOT NULL
, Dim3Key INT NOT NULL
, Dim4Key INT NOT NULL
, Dim5Key INT NOT NULL
, Value1 INT NULL
, Value2 INT NULL
, Value3 INT NULL
, Value4 INT NULL
, Value5 INT NULL
, Value6 INT NULL
, Value7 INT NULL
, Value8 INT NULL
..
, Value101 INT NULL
, Value102 INT NULL
, Value103 INT NULL
);
The suggestion is that some people will see all the nulls and decide to do this instead:
CREATE TABLE dbo.DontDoThisFact
(
Dim1Key INT NOT NULL
, Dim2Key INT NOT NULL
, Dim3Key INT NOT NULL
, Dim4Key INT NOT NULL
, Dim5Key INT NOT NULL
, MeasureTypeKey INT NOT NULL
, Value INT NOT NULL
);
Not good.
In your scenario I think I'd be looking at something like this, which is very different to the scenario described in the articles you referenced.
CREATE TABLE dbo.InventoryFact
(
ContainerKey INT NOT NULL
, TransportTypeKey TINYINT NOT NULL
, EntryDateTime DATETIME NULL
, ExitDateTime DATETIME NULL
);
CREATE TABLE dbo.TransportType
(
TransportTypeKey TINYINT IDENTITY(1,1) NOT NULL
, EntryTransport CHAR(10) NOT NULL
, ExitTransport CHAR(10) NOT NULL
);
INSERT
dbo.TransportType
SELECT
EntryTransport
, ExitTransport
FROM
(
SELECT EntryTransport = 'Train'
UNION
SELECT EntryTransport = 'Truck'
UNION
SELECT EntryTransport = 'Vessel'
UNION
SELECT EntryTransport = 'N/A'
UNION
SELECT EntryTransport = 'Unknown'
) en
CROSS JOIN
(
SELECT ExitTransport = 'Train'
UNION
SELECT ExitTransport = 'Truck'
UNION
SELECT ExitTransport = 'Vessel'
UNION
SELECT ExitTransport = 'N/A'
UNION
SELECT ExitTransport = 'Unknown'
) ex;
To the additional questions...
I would add ExpectedEntryDate
, ExpectedExitDate
to the Container/InventoryFact
. Less certain, without visibility of all the data elements, I would probably put EntryVoyageId
and ExitVoyageId
in a separate junk dimension together as one row along with any other degenerate data items (identifiers for the truck, train etc).
I would add 3 new dimensions for VesselVoyage
, TruckVoyage
and TrainVoyage
and 6 Voyage keys (inbound/outbound) to this one fact (it's 6 new keys, not 6 additional rows). You then have the option of placing Dock
and Tollbooth
in the appropriate Voyage dimension. If you keep the generic data in these dimensions (VesselFlag
, TruckCapacity
) and the specific in a junk dimension (VesselName
, VesselMMSI
) they won't explode in size.
Best Answer
Blame Kimball for the confusing term. Attendance is a fact (as in, "a thing that is known or proved to be true"). However, it is an event, and as such it does not have any extra information apart from references to the corresponding dimension keys. It probably should be called "measureless fact".
A sale, in addition to the dimension references, has measures: quantity and price.