How to Model Multiple FACTs Schema – Best Practices

database-designdimensional-modelingfacttableslowly-changing-dimensionstar-schema

Background

I'm in a process of designing a database (using a STAR schema).

There are three tables to model: products, tests, states.

The database will be used to store results of tests conducted on products (in a great simplification). There can be many tests pointing out to a single product but each test is unique (they are not shared among products). Besides, I need to record the current state of the product, at the time when the test was conducted. Let's assume that the state of a product describes its current location and owner, which are changing very often. That will most likely involve SCD lvl 2 – to track the history of state changes and to be able to locate a product with all its tests as well as the states it had during these tests.

Problem

I'm not entirely sure how to model this problem. It seems obvious to store every test in a FACT table. This table would then consist of thousands of transactions. On the other hand, there will also be hundreds (and later thousands) of products, so I should probably keep them in a second FACT table. Then, there will also be thousands of state changes, so in order to record their entire history, I would need to keep them in a … FACT table as well? I've been told that FACT tables are typically used to store multiple-rows data but on the other hand where are the DIMs in this model?

I also don't know how to model the relationships between these tables. Productsstates is a 1:* relationship. Productstests is a 1:* as well. Finally, statestests is a 1:* too. I would then go with linking products to states and then states to tests (products 1<-* states 1<-* tests), what would allow me to find all states for a particular product and all tests (in all states or in a selected state). What do you think about that? The problem here is that, as I keep adding states, I have two options: either keep duplicating products in the products table (with added "recorded_timestamp" column) or use a SCD lvl 2 in states table, pointing out to the products table with a FK, but this would effectively make the product table a DIM!

Any help here would be very appreciated.

Best Answer

Firstly, neither "star" nor "fact" are abbreviations, as isn't "dimension", so they should not be in all caps. "Fact tables" are called such because they contain facts. A "star schema" resembles (to some people) a star when represented graphically, in case there are many dimensions describing a single fact.

A product cannot be a fact, so it is a dimension. A state may or may not be a fact, so you should look harder at this entity to determine what it really is. A test is certainly a fact.

Depending on what you decide a state to be, your model would be either a two-point "star", with product and state as dimensions, or a one-legged snowflake. You could also find that the combination of state and test is indeed a fact, then you'll end up with a one-point "star".