We are building a tool to track the prices of products over time, and using Postgres as our RDBMS. It is important that product attributes can be changed, and that the history of an product's attributes be preserved forever. Here is a schema we designed based on OpenStreetMap's internal schema:
We have a 'products' table on the left storing every version of every product, and a 'current_products' table on the right storing only the most recent version of each product. Every time we want to change a store, we:
- create an entry in changesets
- read the latest entry of the product in 'products', increment version by one, and create another entry with the changes
- delete the corresponding entry in 'current_products' and create a new one with the changes and the latest version number from 'products'
We want to enforce as many business rules in the database engine as possible rather than relying on our software to keep things consistent, and this schema feels pretty "off", so we welcome any suggestions. Thanks in advance!
Edit:
Revised the schema based a response from @bbaird . Also decided to include versioning of stores and users. Tied products, stores, and users together with price table.
Best Answer
You're right to feel the schema is off, because it is - the way it is designed now will not guarantee the minimum criteria required for consistency: As of a point in time, only one value can exist for a given attribute.
There are two ways to handle this, depending on the use case:
Solution: Case 1
You would have a
Product
table and aProduct_Version
to store the necessary information. You will need a view/function to return the proper value.Since you are dealing with food (and a standard source), I'm going to make certain assumptions about keys/datatypes. Feel free to comment to clarify.
To get the values for a specific product as of a point in time, you would use the following query:
You can also make a view to mimic the function of a table with static values:
For one-to-many relationships (let's use
Ingredient
for this example) you would follow a pattern like so:Then to get a list of
Ingredients
for aProduct
at a point in time, you would use the following query:Similar to the prior example, you can create a view to provide the current values for each of the one-to-many relations.
Solution: Case 2
If you merely need to store history, you simply make a small modification to the structure:
In this case, whenever an update or delete is called for a
Product
, the following operations are followed:Product
Product
table with the new valuesNotes:
Price
), but others do not (Name
,Description
), you can always split things into more tables (Product_Price
,Product_Name
, etc.) and just create a view that incorporates all of those elements. This level of effort generally isn't necessary unless the entities have a lot of attributes or you will have a lot of ad-hoc queries that are asking time-specific questions that rely on knowing the prior value was actually different, such as "Which products increased price during this time frame?"Id
on every table and thinking that provides any sort of value. Time-variant data always requires composite keys and only returns consistent results if the data is properly normalized to at least 3NF. Do not use any sort of ORM that does not support composite keys.