Is dimensional modelling useful for modelling tables of calculations

data-warehousedatabase-design

I have a set of tables that look a bit like this:

Key 1 | Key 2 | Key 3 | Var 1 | Var 2 | Var 3 | ... | Var 200
-------------------------------------------------------------
   X  |   Y   |   Z   |   2   |   3   |   5   | ... |    8

The first three columns form a compound key. The remaining columns are populated by a series of calculations such as:

Var 3 = Var 1 + Var 2
Var 4 = Var 3 + Var 2
Var 5 = Var 3 + Var 4
....

(This is a huge simplification of the formulae. But, in essence, there is a chain of calculations that builds on previously calculated items.)

So, when considering how to build a dimensional model, I can identify the components of the key as a dimension (or two), and the variables seem to be measures, but I am struggling to identify what the fact is. I wonder if there is a key insight into the dimensional modelling process that I'm missing, or whether it's just inappropriate for my situation.

EDIT: I am not certain what I would achieve by dimensional modelling either! The context is an existing application which processes 3NF tables, does calculations, and spits out 3NF tables. It is proposed that these tables are migrated into a data warehouse which is being built using the dimensional modelling methodology. I am struggling to understand how this might be of benefit, since the calculations are abstract mathematical and statistical computations, while dimensional modelling seems to be oriented around identifying the "business process" that gives rise to facts.

Best Answer

If you could meaningfully slice the data by key1, key2 or key3 then you could make dimensions out of them. In that case you would have separate dimension tables that had all the values of each key, and any relevant attributes (even just a human-readable description of what the value represents) or groupings (roll-ups of the key values).

Each dimension table would have the key value and its attributes, and would link to the fact table on its key.

If your values (Var X) are additive across rows (i.e. all the values added together for Key1='Foo' mean something) then you can use the keys to slice the data.

I don't think that relationships between the Var columns are really relevant to dimensional modelling unless you want to unpivot them and indentify them by a 'variable type' dimension rather than separate columns

Beyond that, the question is a bit vague to really get a clear view of what you are trying to achieve and why you want to make a dimensional model of your data in the first place. Maybe if you can clarify your goals a bit we can give you a more meaningful answer.

EDIT: Dimensional modelling is just identifying the axes by which you want to slice data in aggregate. Sometimes a dimension might be very simple - just a code. Sometimes it may be complex with lots of sub-attributes - such as an insurance policy. A dimension lets you slice data in aggregate, either by its key or by rolling up the data by some attribute of the dimension.

If the VarX columns are additive between rows - i.e it makes sense to aggregate (for example) var2 for two or more rows - then a dimensional structure might make sense for reporting. One key point of a dimensional model is that it tends to facilitate efficient queries in aggregate, and plays nicely with OLAP tools.

So, if I had values 'Foo', 'Bar', 'Wibble', 'Blarg' for key1 and values 'A', 'B', 'C' for key2 then I might have a few rows that looked a bit like:

Key1      Key2       Var1       [. . .]
--------- ---------- ---------- ----------
Foo       A          10         . . .
Bar       A          20         . . .
Foo       B          50         . . .

If I wanted to see the total of Var1 for all Key2='A' then the reporting tool would allow me to slice on Key2 and add up all the rows. If I wanted to see the total of Var1 for Key1='Foo' then I could do something similar. In this instance the total of Var1 for Key2 = 'A' would be 30.

If this is the sort of thing you want to do the for data then a dimensional model could help, and would allow you to put up a statistical reporting facility with an off-the-shelf reporting tool such as Report Builder.

A dimension table for Key1 might look like

Key1    Key1Desc                 Key1Grouping
------- ------------------------ -------------
Foo     Description of 'Foo'     Group A
Bar     Description of 'Bar'     Group A
Wibble  Description of 'Wibble'  Group A
Blarg   Description of 'Blarg'   Group B

By joining this dimension table to the fact table you can then roll up by Key1Grouping and see stats for (say) 'Group A.' You can calculate ratios of Vars after the base additive measures have been aggregated.

Again - I'm still a bit unsure of quite what you want to know.