MySQL – Table with Multiple Foreign Keys vs Separate Tables for Each FK

MySQLperformancequery-performance

I have tables called 'books', 'category', 'subcategory', and 'booksCategory'. Each book can have its own category and subcategories.
For example, a Sci-Fi book can belong in the 'category' table under sci-fi, but it also may be under the subcategory under 'Aliens'.

  • Table books – a list of books
  • Table Category – a list of categories. e.g. Sci-Fi
  • Table Subcategory – a list of subcategories. e.g. Aliens
  • Table booksCategory – a list of books under categories and subcategories

I wasn't sure of the most optimal way of creating booksCategory.
booksCategory will have a FK referencing the ids of both Category and Subcategory tables.

CREATE TABLE booksCategory(
  booksCategory_id INT NOT NULL AUTO_INCREMENT primary key,
  category_id INT references category (category_id),
  subcategory_id INT references subcategory (subcategory_id)
)

so the results would be something like this

booksCategory_id | category_id | subcategory_id | books_id
----------------------------------------------------------
     1                 3            4                 1     // 3 = SCI-FI, 4 = Aliens
     2                 2            NULL              1    // 2 = Adventure, Null = does not belong to any subcategory

The problem occurs when a book falls under multiple categories. In the example above, book_id with 1 can be both SCI-FI and Adventure. Is this the best method because then I will have a NULL in subcategory?

My other option I can think of is to have separate tables for booksCategory, booksSubcategory, and if I have more subcategories, bookSubcategories2. So basically a separate list of books that fall under Category, another list of books that fall under subcategory, and if I have more branching subcategories, a list of books that fall under that subcategory. I'm not really too fond of this method because it seems less organized/scattered.

Best Answer

Because you desire a categorization hierarchy of indeterminate depth, you will have to use a single category table structured along these lines:

CREATE TABLE Category (
    Category_ID           INT NOT NULL AUTO_INCREMENT primary key,
    Parent_Category_ID    INT NOT NULL REFERENCES Category(Category_ID),
    Description           varchar(255) not null
)

This type of structure is not as well supported by MySQL as some other RDBMS's, but these issues can all be worked through.

You will also need a revised BookCategory mapping table such as this:

CREATE TABLE BookCategory (
    Book_ID        INT NOT NULL references Book(Book_ID),
    Category_ID    INT NOT NULL references Category(Category_ID)
)

My apologies if the Foreign Key syntax above is not quite correct for MySQL; I usually work with SQL Server.

Update With this INSERT:

insert Category(Parent_Category_ID,Description)
values (1,'Mystery')
      ,(2,'SciFi')
      ,(2,'Aliens')
      ,(3,'Humanoid')
      ,(3,'Reptilian')
      ,(3,'Other')
;

the following SELECT:

select
    root.Category_ID,
    cast(
      coalesce(grandparent.Description + '-',' ') +
      coalesce(parent.Description + '-',' ') +
            root.Description 
    as varchar(255)) as Description
from Category                           root
left join Category                      parent
   on parent.Category_ID            = root.Parent_Category_ID
  and root.Parent_Category_ID      <> root.Category_ID
left join Category                      grandparent
   on grandparent.Category_ID       = parent.Parent_Category_ID
  and parent.Parent_Category_ID    <> parent.Category_ID

yields this:

Category_ID Description
----------- 
1             Mystery
2             SciFi
3            SciFi-Aliens
4           SciFi-Aliens-Humanoid
5           SciFi-Aliens-Reptilian
6           SciFi-Aliens-Other

When Common Table Expressions are supported (not yet in MySQL I understand) the following will provide automatic nesting of the hierarchy levels:

with data as (
    select
        Category_ID,
        cast(Description as varchar(255)) as Description
    from Category
    where Parent_Category_ID = Category_ID

    union all

    select
        child.Category_ID,
        cast(parent.Description + '-' + child.Description as varchar(255)) as Description
    from data       parent
    join Category   child
       on child.Parent_Category_ID  = parent.Category_ID
    and child.Parent_Category_ID <> child.Category_ID
)
select 
     Category_ID,
     Description
from data