Best approach for a one-to-many list – how to store/acces some related count info

best practices

I have a table with persons, and other with interactions, notes and tasks, all related to person(s).
I need to have quick access to:

  • last interaction date.
  • number of interactions in the past 1,6, 12 months.
  • number of notes (Total)
  • number of tasks

All this info are used for sorting highlighting in a list/table. When user select a person, all related info can be requested from DB, question is how to get this data for all persons from db?

Version 1:
Sub-query and add the info for each person in table.
Could be intensive when tables grows.

Version 2: store this info in person table, update them when new info is created/modified.
This violates the 'single source of true' principle and could be prone to errors.

Any ideas?

Best Answer

Without specifics such as which database system, version, and implementation (when applicable), your table schema, how much data there is, anticipated will be, and general growth pattern, it's hard to recommend one option over the other.

It looks like you're trying to do standard types of OLAP querying against your data in which case both options are valid (to a degree). Most modern RDBMS can handle OLAP type queries for Tables with rows into the multi-millions without a blink of an eye. So option 1 can be sufficient of a solution up to a certain point, as long as proper architecting and indexing is adhered to. For example (and this is where it's important to know specifics like which database system you're on), Microsoft SQL Server offers columnstore indexing which can immensely improve the performance of an OLAP type query by efficiently compressing the data in a columnar format which helps improve aggregation queries. YMMV on these kinds of features depending on which database system you're using.

Alternatively, when you get to a point where the data is just too much for your regular OLTP architected database to handle, then pre-storing data rollups and the answers to your other OLAP type queries in data cubes (loosely using the term here) makes sense and is accomplished by either a real-time process (such as via triggers) or periodically calculated at a rate acceptable to the business needs, and is considered the one source of truth for whatever state it's in at any point of time. But I would still follow some level of normalization and not store this information directly on the person table, rather stored on a table that references the person table (by storing the person_id for example).