I'm setting up my first dimensional database with SSAS, and I have this [Materials] dimension that needs a hierarchy that goes something like this:
[PriceCode v] --> nullable
Price Code
...
[Material v]
Code
AltCode
Name
...
[Id v] --> not actually exposed as a hierarchy level
DateInserted
DateUpdated
DateDeleted
EffectiveFrom
EffectiveTo
The problem is that the [PriceCode]
attribute is nullable; the DSV has a FK between a [Materials]
and a [PriceCodes]
table, and [Materials].[PriceCodeId]
is nullable.
Is there a way to still define a hierarchy where a nullable attribute is a parent? I've fiddled with UnkownMember and UnknownMemberName and the attribute key's NullProcessing setting, but couldn't get the dimension to process.
Bonus points if someone could confirm whether I'm approaching the slowly-changing dimension problem correctly, by making a hierarchy level based on business keys (i.e. the Code
field; the natural key includes the EffectiveTo
field, which is null
for current image of a record), and treating the SCD metadata as a level of its own.
Best Answer
You actually have 2 questions in one question. If you create a new question for the attributes it would be neater and I'll cut/paste half of this as an answer there :)
Nullable Parent Level
You probably don't want
NULL
s in your OLAP dimensions, and Kimball seems to agree.It kind of depends if you have an
ETL
process and a Data Warehouse or not how you should be handling them, but there are different types of 'not found'.Think about the difference in a foreign key, one has an empty field, another has a field that's filled but the related record can't (or no longer) be found. I like to diffentiate between
BLANK
andDATA ERROR
in my dimension.In your example you could differentiate between 'no price code' and 'a price code I can no longer find'
If you have an
ETL
process with a Data Warehouse you can handle that easily in yourETL
process, if you don't you would need some case statements in your DSV queries.This question seems to reveal issues with the underlying Data Warehouse. There are arguments for and against both star and snowflake schema's but personally I tend towards a star schema, with some snowflake mixed in when necessary.
In any case the data cleansing and missing links needs to be solved in your Data Warehouse long before you reach the dsv.
Slowly Changing Dimension attributes
With regards to your
Slowly Changing Dimension
I don't see how the data type of hierarchies or keys in your dimension would change because the dimension somehow isSCD
, that doesn't matter at all. You just need a validity rule somewhere in your ETL that gets picked up by your SSAS dimension definition (See here). But for anydimension key
you create I suggest you use a surrogate key mostly because your surrogate key can be anint
orbigint
instead of a varchar and that could massively improve performance even for attribute keys.Off course that numeric key would be a representation of 'the attribute' and not necessarily include the validity fields. The validity of the record is specified at the record in your dimension table, but as you state isn't necessary for your attribute keys.
For example this could be your dimension data
Where you can pick dimension_key for the key of your
key attribute
and you could pick either name or name_key as the key of yourname
attribute.Determining if it's worth the hassle for
name
depends on how many members your attribute will have (and your key attribute typically has most members).In the end there isn't really any relation between the fact that you have a
SCD
and your decision what key is a good pick for your attribute. End user requirements make that decision for you. In the example dimension you would want all sales by mat reported under mat, and not have 2 mat's in your members when users report on that.