Sql-server – SSIS / SSAS: Handling Age Columns

nullsql serverssasssis

I have some columns representing Age in a table, taking values from 0 to 100, but also -1 for 'Missing Value'. I don't want to store them as strings to be able to find average age etc., but SSAS turns 0 into NULL by default, and I also want to replace -1 with NULL, so there is the classic "duplicate attribute key" error.

Is there a proper way / type to handle age values?

Best Answer

You are confusing 2 things in your question. Since you state you have duplicate attribute key errors you are using the the age table as a dimension, but since you also want to find averages you are using the age table as a fact table.

When using ages as a dimension you probably don't want every single age on your axis, so what is typically done is divide the ages in buckets and have the buckets stored as strings so you get something like this

+-------+--------+
|  Age  |  sales |
+-------+--------+
| <10   |      0 |
| 10-20 |    100 |
| 21-30 |    250 |
| 31-40 |    124 |
| 41-50 |   3598 |
| > 50  |    715 |
+-------+--------+

You could either use discretization buckets in your dimension or create the buckets manually in your dimension table.

However if you want to average ages too, I suppose you want to see the average across some dimensions (such as average age of people ordering product X) so you need to create a fact table with the keys of the dimensions you want to report across and then define relationships between your fact table and dimension tables

Mixing the 2 concepts (dimensions and facts) in the same table will lead to headaches over time so you should either solve that in your ETL process or using named queries in your dmv in SSAS.