Database Design – Splitting Auditable Table Based on Update Frequency

auditdatabase-design

I'm currently designing a database schema for a small microservice which handles products.

The service contains a simple REST API which allows users to perform basic management of products and an another API which is used to perform operations which consume these products. The maximum amount of consumes per product is limited with an explicitly defined consumables count. When this limit is reached, the product cannot be consumed anymore. Each product typically contains about 1,000 – 1,000,000 consumables, while a single operation may take 1 – 10 consumables at a time.

The frequency of the management operations is quite low, but the rate of the consuming operations is very high.

At the moment, the schema puts all product information into a single table. This table also contains information which must be audited into a snapshot based history table on each INSERT, UPDATE and DELETE operation. This is currently done with DB triggers, which know how to add rows into the history table along with the additional audit information. Each consumable use also adds a new row into consumables table with some additional information related to how it was consumed.

Here's a simplified example about how these tables are currently organized.

------------------------
| product              |
------------------------
| + id                 |
| + name               |
| + is_enabled         |
| + consumable_limit   |
| + consumable_counter |
| ...10 omitted fields |
------------------------

------------------------
| product_history      |
------------------------
| + id                 |
| + product_id         |
| + time               |
| + user_id            |
| + history_type       |
| + name               |
| + is_enabled         |
| + consumable_limit   |
| + consumable_counter |
| ...10 omitted fields |
------------------------

------------------------
| operations           |
------------------------
| + id                 |
| + product_id         |
| + consumable_idx     |
| ...10 omitted fields |
------------------------

In the above diagram, I've omitted some fields to keep things more simple. The thing that keeps me wondering here is that the consumable_counter in the product table gets rapidly updated with consuming operations, which may further flood the product_history table. Would it be better move the consumable counter into an another table where it would not trigger the a full snapshot of the product table row each time the consumable_counter is being updated? Somehow, I feel like it could be ideal as the consumable_counter is the only column in the product table which is being updated with a high rate. A full snapshot on each counter increment somehow feels a bit overkill.

Edit

The consumable_counter acts as a counter of how many times the corresponding product has been used. When it reaches the consumable_limit, the product cannot be consumed any more.

Edit

Each time a product is being consumed, a new entry will be added into the operations table. Entries in this table also act as business logic transactions containing a state which will be managed during the operation execution flow.

Best Answer

Short answer:

Yes, it is perfectly acceptable to split the audit tables into two subject areas if one attribute is updated very frequently, and the other attribute(s) are updated infrequently. But this has more to do with lowering write overhead and possibly making changes of certain attributes easier to find, not good or bad design.

Long answer:

There are reasons banks and other institutions do not store a static amount for balance/quantity. Namely:

  1. The data is easily derived from transactional data, in a high-volume environment this eliminates an additional update.
  2. Depending on the design pattern used, it is possible, if not probable the transactional data will disagree with the static value. If you are charging money for your products, or otherwise subject to audit, this is unacceptable.

See these two answers from SO which provide more insight and guidance than I can provide:

https://stackoverflow.com/a/29713230/13942986

https://stackoverflow.com/a/59465148/13942986

Additional Observations

  1. The Id on your product_history and operations tables are row pointers, not keys, and are entirely unnecessary. The primary key in both situations should be along the lines of (product_id, update_datetime) or (product_id, transaction_datetime).
  2. Context matters. Rename Id on product to product_id.
  3. Consistency matters. Use singular names, so operation, not operations.