Mysql – Saving daily data for each user – MySQL

database-designMySQL

I'm currently thinking about how to store data that the user might (or might not) change each day. Please note, I'm not a rdb expert and I'm still trying to learn everything.

For example, the user may change his weight today, (2021-04-24) to 80kg then tomorrow change it to 80.25kg, and so on (its also possible to go back in time inside the app and change the weight a month ago for example)

What I'm struggling with is how to save the data in a scalable way.
I know that I might want to add different data_types, for example, the option to change the daily waist measurement.

All options are assuming that the app have about 10K daily active users (not a lot but its just for demonstration sake)

Option 1: Save each day as one row, if I'll need to add new data_type, I'll have to do a ALTER TABLE ADD COLUMN `waist` . Which I know will lock the table and its probably not a great idea in production (or am I wrong?)

CREATE TABLE `daily_data` (
  `date` date NOT NULL,
  `user_id` int NOT NULL,
  `weight` float NOT NULL,
  `created_at` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `updated_at` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`date`, `user_id`)
);

Number of rows per year: 3,650,000.

Option 2:
Split the table in to 2 tables, So now I can easily add a new data_type in production by just adding a new row in data_types

CREATE TABLE `data_types` (
  `type_id` int PRIMARY KEY NOT NULL AUTO_INCREMENT,
  `name` varchar(20) NOT NULL UNIQUE,
  `created_at` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `updated_at` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
);

CREATE TABLE `daily_data` (
  `date` date NOT NULL,
  `user_id` int NOT NULL,
  `type_id` int NOT NULL,
  `data` float NOT NULL,
  `created_at` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `updated_at` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`date`, `user_id`, `type_id`)
);

The number of rows per year increases significantly (10,000 * number_of_types * 365).

Lets say that there are 3 types: 10,950,000.

This makes me think if its a good solution because 10K daily active users is not a lot of users but the number of rows is very big.

Option 3:
A bit more efficient because now instead of inserting new row per user per day, I'll insert a new row only when the user actually changes the data.

 CREATE TABLE `daily_data` (
   `id` int PRIMARY KEY AUTO_INCREMENT,
   `user_id` int NOT NULL,
   `start_date` date NOT NULL,
   `end_date` date,
   `type_id` int NOT NULL,
   `data` float NOT NULL,
);

With this solution there are a lot less rows because now potential duplicated data will not occur.
Minimum number of rows: (10,000 * 3) – None of the users ever changed their data

Maximum number of rows: 10,000 * 3 * 365 – Each user changed each data_type one day a year.

  1. What's the best solution?

  2. Is it reasonable to have this many rows for 10K daily active users? what if the app grows to 100K users? is it still fine?

  3. Is there a better solution I'm missing?

Note: the third option refers to Option no.2 schema but can obviously be applied to Option no.1 as well.

My main goal is to be able to retrieve data as quickly as possible for my app.

Any kind of help is much appreciated!! if something is unclear, please comment and I'll clarify.

Best Answer

Definitely save new day data only if something has changed.

Now the question is onr table with data , or types and data.

Depends on number of columns and the periodicity of change.

For two table, there is a higher overhead, but iwth single wide table (lets say 10 or more types or data), you will be storing lot of duplicate data. In addition, more joins will be needed. I would say single table is better if storage is not an issue