How to Model Multiple FACTs Schema – Best Practices

database-designdimensional-modelingfacttableslowly-changing-dimensionstar-schema

Background

I'm in a process of designing a database (using a STAR schema).

There are three tables to model: products, tests, states.

The database will be used to store results of tests conducted on products (in a great simplification). There can be many tests pointing out to a single product but each test is unique (they are not shared among products). Besides, I need to record the current state of the product, at the time when the test was conducted. Let's assume that the state of a product describes its current location and owner, which are changing very often. That will most likely involve SCD lvl 2 – to track the history of state changes and to be able to locate a product with all its tests as well as the states it had during these tests.

Problem

I'm not entirely sure how to model this problem. It seems obvious to store every test in a FACT table. This table would then consist of thousands of transactions. On the other hand, there will also be hundreds (and later thousands) of products, so I should probably keep them in a second FACT table. Then, there will also be thousands of state changes, so in order to record their entire history, I would need to keep them in a … FACT table as well? I've been told that FACT tables are typically used to store multiple-rows data but on the other hand where are the DIMs in this model?

I also don't know how to model the relationships between these tables. Products – states is a 1:* relationship. Products – tests is a 1:* as well. Finally, states – tests is a 1:* too. I would then go with linking products to states and then states to tests (products 1<-* states 1<-* tests), what would allow me to find all states for a particular product and all tests (in all states or in a selected state). What do you think about that? The problem here is that, as I keep adding states, I have two options: either keep duplicating products in the products table (with added "recorded_timestamp" column) or use a SCD lvl 2 in states table, pointing out to the products table with a FK, but this would effectively make the product table a DIM!

Any help here would be very appreciated.

Best Answer

Firstly, neither "star" nor "fact" are abbreviations, as isn't "dimension", so they should not be in all caps. "Fact tables" are called such because they contain facts. A "star schema" resembles (to some people) a star when represented graphically, in case there are many dimensions describing a single fact.

A product cannot be a fact, so it is a dimension. A state may or may not be a fact, so you should look harder at this entity to determine what it really is. A test is certainly a fact.

Depending on what you decide a state to be, your model would be either a two-point "star", with product and state as dimensions, or a one-legged snowflake. You could also find that the combination of state and test is indeed a fact, then you'll end up with a one-point "star".

Related Solutions

Database schema for a product with multiple categories and hierarchical categories

What you are proposing is a good solution for your requirement of M:N products to categories and hierarchical categories.

To avoid exposing yourself to numerous updates: You need to do two things to ensure that you don't have a lot of updates in your intersection table.

First, you need to be sure that your categories have a stable, persistent primary key.

Second, you need to link food items to leaf categories. Don't join Cherry to Red, Healthy, Fruit and Food - just join it to Red and Healthy. Your nested sets take care of all of the secondary (and higher level) associations.

Mysql – Database Design (Stores + Products)

Your problem can be modelled by an entity relationship diagram (created with dia):

ER-diagram .

There are two entities "product" and "store" and there is an m-n-relationship "stocks" between these entities. "product" has an attribute "productId" that uniquely determines the product. "store" has the unique attribute "storeId" that determines uniquely the store.

I have added the attributes "quanity", "storeAddress" and "productName" to make the model a little bit more complex. These attributes cannot be derived by your description, you can remove it from the diagram and from the resulting tables if you want the design follow your description closely. An n-m-realtionship means that a store can stock different products and a product can be in stock in different stores. We have the following problem desription:

Each store has a unique Id and an address
Each product has a unique Id an a name
a store can stock different products with different quantities

The last sentence can also be formulated in this way

a product can be stocked in different stores with different quantities

Such a diagram can be transformed in a relational model. Each entity is a table where the attributes are the columns. The unique identifier is the primary key. The m-n-relation is a table where the primary key is the a (storeId,productId) pair. If the relation has an attribute this attribute is a column in the table.

so we have the following relations (tables)

store("storeId", storeAddress)
product("productId",productName)
stock("storeId","productId",quantity)

Best Answer

Related Solutions

Database schema for a product with multiple categories and hierarchical categories

Mysql – Database Design (Stores + Products)

Related Question