MariaDB – Database Normalization Causing Inefficiency

mariadb performance

Note: I am relatively new to managing an SQL database. I have done my fair share of reading, research, and experimentation, but it is possible that I am missing some fairly basic knowledge. I'm also not sure how to tag this question so if someone could add relevant tags that would be appreciated.


I am currently building an SQL database for a forum-like website. Among other things, the website will contain pages with threads, basically a series of posts by different users.

Users on the website are identified by a unique ID (char(36)) that is set once at registration and cannot be changed afterward. The user also has a username (varchar(16)) that is just a screen name. This name is almost always unique between users, but not necessarily. The user can also change the name, but there is a hard limit of two weeks between any two username changes. All of this information, and a few other things like login details, are stored in one Users table with the ID as a primary key.

There are multiple tables for the forum itself, but the one that is important for this question is the Posts table (primary key: post_id int(10) NOT NULL AUTO_INCREMENT PRIMARY KEY). It contains one row for every time somebody makes a post on a thread, including responses to existing posts. As a result, whenever a user requests a thread, I query the Posts table for all posts matching that thread (I store a thread ID foreign key). My question concerns how to store the author of a post. When a thread is requested, I need to send the username (not the ID) of each user who made a post. However, I also need to make sure that changes in usernames do not cause the apparent author of a post to change. As far as I can tell I have two possibilities:

  1. For storing the author of a post, I can store only the ID. However, this means that I will have to make quite a few requests to the Users table for the usernames of the various IDs of the posters. This seems highly inefficient, seeing that the usernames will change very rarely.

  2. I can store both the author ID and the author username in the Posts table. Whenever a user changes their name, I can then go through the table and update the username on all of their posts (I can even set this up with a trigger, I believe). However, this opens the window for update anomalies, if the username stored for an ID in the Users table differs from the one stored in one of the posts. I can still use a trigger to prevent these anomalies at the database level, but from what I understand this should not be necessary in a well-designed database.
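To make option 1 concrete, here is a minimal sketch of what it could look like. It uses Python's built-in sqlite3 module as a stand-in for MariaDB so that it runs without a server, and every table, column, and function name in it (thread_id, author_id, body, thread_posts) is an assumption made for illustration, not the real schema. Under those assumptions, fetching a thread together with the current usernames is one JOIN query rather than one lookup per post, and a rename touches a single row in Users.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not taken from the real site.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (
        id       CHAR(36)    PRIMARY KEY,
        username VARCHAR(16) NOT NULL
    );
    CREATE TABLE Posts (
        post_id   INTEGER PRIMARY KEY AUTOINCREMENT,
        thread_id INTEGER  NOT NULL,
        author_id CHAR(36) NOT NULL REFERENCES Users(id),
        body      TEXT     NOT NULL
    );
""")
conn.executemany("INSERT INTO Users VALUES (?, ?)",
                 [("u-1", "alice"), ("u-2", "bob")])
conn.executemany(
    "INSERT INTO Posts (thread_id, author_id, body) VALUES (?, ?, ?)",
    [(1, "u-1", "First post"), (1, "u-2", "A reply"), (2, "u-1", "Other thread")],
)


def thread_posts(conn, thread_id):
    """Return (post_id, username, body) for one thread using a single JOIN."""
    return conn.execute(
        """
        SELECT p.post_id, u.username, p.body
        FROM Posts AS p
        JOIN Users AS u ON u.id = p.author_id
        WHERE p.thread_id = ?
        ORDER BY p.post_id
        """,
        (thread_id,),
    ).fetchall()


before = thread_posts(conn, 1)

# A rename updates exactly one row in Users; no Posts rows are modified.
conn.execute("UPDATE Users SET username = 'alice2' WHERE id = 'u-1'")
after = thread_posts(conn, 1)
```

In this sketch the post keeps pointing at the same author ID after the rename; only the displayed name follows the Users table. The same SELECT should carry over to MariaDB, where an index on the thread ID column would be the usual companion to it.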

Which one of these two solutions (or something entirely different) should I use to both maintain efficiency and database normalization?

If it matters, I'm using the most recent version of MariaDB and accessing using PHP.

Best Answer

I'm using the most recent version of MariaDB [...] 10.0.31-MariaDB-0ubuntu0.16.04.2

That's not at all the "most recent version". MariaDB 10.0 was first released on 12 Nov 2012. It's not even the most recent security release of that series, which is 10.0.33, released on 30 Oct 2017. The version you're on has been frozen for over 5 years (no development, only fixes for "security issues and bugs").

I am currently building an SQL database for a forum-like website. Among other things, the website will contain pages with threads, basically a series of posts by different users.

In order to build threads, you really need recursive CTEs. This is a hierarchical question and you can't model hierarchy relationally without recursion. MariaDB added recursive CTEs in 10.2.2, and the 10.2 series went stable with 10.2.6 on 2017-05-23.
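As a rough illustration of what a recursive CTE does for threaded replies, here is a small sketch. It is run through Python's built-in sqlite3 module only so that it is self-contained; the WITH RECURSIVE statement itself is standard SQL and should work on MariaDB 10.2.2 or later. The parent_id column, the sample rows, and the THREAD_TREE name are assumptions for the example, not part of the schema described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical adjacency-list layout: each reply points at its parent post.
    CREATE TABLE Posts (
        post_id   INTEGER PRIMARY KEY,
        thread_id INTEGER NOT NULL,
        parent_id INTEGER REFERENCES Posts(post_id),  -- NULL for a top-level post
        body      TEXT NOT NULL
    );
    INSERT INTO Posts VALUES
        (1, 1, NULL, 'root'),
        (2, 1, 1,    'reply to root'),
        (3, 1, 2,    'reply to reply'),
        (4, 1, 1,    'second reply to root');
""")

THREAD_TREE = """
WITH RECURSIVE tree (post_id, parent_id, body, depth) AS (
    -- Anchor: top-level posts of the requested thread.
    SELECT post_id, parent_id, body, 0
    FROM Posts
    WHERE thread_id = ? AND parent_id IS NULL
  UNION ALL
    -- Recursive step: attach each reply one level below its parent.
    SELECT p.post_id, p.parent_id, p.body, t.depth + 1
    FROM Posts AS p
    JOIN tree AS t ON p.parent_id = t.post_id
)
SELECT post_id, depth, body FROM tree ORDER BY post_id
"""

rows = conn.execute(THREAD_TREE, (1,)).fetchall()

# Indent each post by its nesting depth.
for post_id, depth, body in rows:
    print("  " * depth + f"#{post_id} {body}")
```

The ORDER BY post_id is there only to make the output stable for this sample data. A real thread view would more likely sort on a path or timestamp built up inside the CTE.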

With that you could get the job done, but if you haven't started the job, give PostgreSQL a shot. I did this very task here. You can just take that code and you're done, and if you migrate to PostgreSQL you get an all-around better database too.

For more information on the problem domain, check out