MySQL – Many to Many relationship, or one to many with additional columns

database-designerdjoin;MySQL

Our product contains some constraints that made us contemplate which table design will yield the best performance.

Our model:

Question table which contains fields like id, body, difficulty, and tags.
- Each question can have up to five different tags.
Tags table which contains fields like id, and tag name.

The question table will have significantly more reads than writes.

We considered two design options:

Many to many relations where we'll have the following tables

Question – TagsQuestion – Tags

One to many relation where we'll have the following tables

Question – Tags

In this option, the question table will contain five tags columns (TagA, TagB, etc)

The "read" queries will mostly fetch questions by tag or tags.

Best Answer

Basically you can only use

Question - TagsQuestion - Tags

With the bridge table, if one tags is set for many questions.

It wuld also save some space, if you have a lot of questions.

The sencond option has the problem, that if you wqant all tags t be displayed you need five INNER JOINs, for every Tag one, while you can make the same query with two joins, with the first option..

Overall, the first option with bridge table, is the more flexible and space consuming option.

Related Solutions

Database Design – Creating a Bookmark System

Designing for maximum performance for an anticipated large user-base, I would suggest ensuring bookmarks and tags are not duplicated unnecessarily. Using my suggested schema below, http://google.com/ would become a single bookmark row. Users who want a bookmark to Google would have a row in UserBookmarksTags referencing the Bookmark row and the associated tag row.

I would tend to abstract the UserID away from the Bookmarks table, and use a cross-reference table to determine which users have bookmarks, and what they've tagged each bookmark with.

Using SQL Server, I would do it like this:

USE tempdb;
GO
CREATE TABLE Users
(
    UserID INT NOT NULL 
        CONSTRAINT PK_Users
        PRIMARY KEY CLUSTERED
        IDENTITY(1,1)
    , UserName varchar(255)
);
CREATE TABLE Bookmarks
(
    BookmarkID INT NOT NULL
        CONSTRAINT PK_Bookmarks
        PRIMARY KEY CLUSTERED
        IDENTITY(1,1)
    , BookmarkName varchar(255)
    , BookmarkURL varchar(255)
);
CREATE TABLE Tags
(
    TagID INT NOT NULL
        CONSTRAINT PK_Tags
        PRIMARY KEY CLUSTERED
        IDENTITY(1,1)
    , TagName varchar(255)
);

If every entry in your system requires both a bookmark and a tag, you could use the following table for maximum performance:

CREATE TABLE UsersBookmarksTags
(
    UserID INT NOT NULL
        CONSTRAINT FK_UsersBookmarksTags_UserID
        FOREIGN KEY REFERENCES Users(UserID)
    , TagID INT NOT NULL
        CONSTRAINT FK_UsersBookmarksTags_TagID
        FOREIGN KEY REFERENCES Tags(TagID)
    , BookmarkID INT NOT NULL
        CONSTRAINT FK_UsersBooksmarksTags_BookmarkID
        FOREIGN KEY REFERENCES Bookmarks(BookmarkID)
    , CONSTRAINT PK_UsersBookmarksTags
        PRIMARY KEY CLUSTERED 
        (UserID, TagID, BookmarkID)
);

The above version of the table UsersBookmarksTags allows you to quickly return a list of bookmarks for a specific user that matches any given tag, or all bookmarks for that user.

Alternately, you could create the cross reference table that allows null entries for both Tags and Bookmarks, as in:

CREATE TABLE UsersBookmarksTags
(
    UsersTagsBookmarksID INT NOT NULL
        CONSTRAINT PK_UsersBookmarksTags
        PRIMARY KEY CLUSTERED
        IDENTITY(1,1)
    , UserID INT NOT NULL
        CONSTRAINT FK_UserBookmarks_UserID
        FOREIGN KEY REFERENCES Users(UserID)
    , TagID INT NULL
        CONSTRAINT FK_UserBookmarks_TagID
        FOREIGN KEY REFERENCES Tags(TagID)
    , BookmarkID INT NULL
        CONSTRAINT FK_UserBooksmarks_BookmarkID
        FOREIGN KEY REFERENCES Bookmarks(BookmarkID)
);

This allows rows that have a bookmark without a tag, and also allows tags to be associated to a user without a bookmark.

If there is a requirement to know all the tags a user has added, without those tags necessarily having an associated bookmark, I would consider adding another table for that purpose, as:

CREATE TABLE UserTags
(
    UserID INT NOT NULL
        CONSTRAINT FK_UserTags_UserID
        FOREIGN KEY REFERENCES Users(UserID)
    , TagID INT NOT NULL
        CONSTRAINT FK_UserTags_TagID
        FOREIGN KEY REFERENCES Tags(TagID)
    , CONSTRAINT PK_UserTags
        PRIMARY KEY CLUSTERED 
        (UserID, TagID)
);

PostgreSQL Database Design – Multiple Tables or Many Columns

If you have 100-150 columns, it is likely some of those are repeating groups. Those should be normalized into their own child tables (PK: user_id, sqn). Review your design. You should normalize to at least 3rd normal form.

Joins are usually very efficient, and in many cases only rows required for the result are joined. With that many columns, it is unlikely you would ever have to return all the columns, so you would likely be joining a subset of the tables.

You should be concerned with the cost of searching for matches on columns that aren't indexed. The higher the column number, the more expensive it is to find the column to compare it. This cost is likely to far outweigh the cost of joining.

The main reasons I use a one-to-one tables are:

Separate large tombstone data from volatile information.
Separate data with higher security requirements, from more public data.
Store sub-type data (one to zero-or-one).

Best Answer

Related Solutions

Database Design – Creating a Bookmark System

PostgreSQL Database Design – Multiple Tables or Many Columns

Related Question