Mysql – How to model a three-way association that involves Product, Category and Label

database-designforeign keymany-to-manyMySQLprimary-key

I have the following three tables:

products:
    product_id,
    product_name,
    ...

categories:
    category_id,
    category_name,
    category_parent_id,
    category_priority (for sort ordering),
    .....

labels:
    label_id,
    label_name,
    .....

The idea is that products assigned to a category will be grouped within each category by labels and listed on the website in this manner:

---label1---
product_1
product_2
product_3
---label2---
product_4
product_5
---label3---
product_6
product_7
product_8
product_9
etc.

I can’t figure out how to design an association table (or tables) that glue all this together and prevent anomalies like this:

---label1---
product_1
product_2
---label2---
product_2
product_3

At the same time, I’d like to allow a situation where there will be no labels assigned to a category when there are not enough products to justify it.

Questions

Is it possible to design a structure that will hold it together or should I ‘abandon all hope’ and go with something like this:

categories:
    category_id,
    category_name,
    category_parent_id,
    categor_is_label,
    category_priority

products:
    product_id,
    product_name,
    ...

plus the association table that follows:

categories_products:
    category_id,
    product_id,
    priority

and handle all logic and checking for anomalies in the application?

I’m assuming that users won’t have direct access to the database.

Comments and chat interactions

For those interested in an in-depth conversation about the business context at hand, you can visit this chat room.

Best Answer

Note: This answer presents an approach to cover specifically the business rules derived by way of the earliest series of comment and chat interactions (most of which can be seen in this question revision). Nevertheless, after more in-depth clarifications and deliberations took place, @yemet indicated that the business context may require a somewhat distinct method, due to the successive identification of dfferent business rules.


The fact that you have determined that there exists a three-way association (also known as ternary or diamond shaped relationship) that involves the entity types (tables once implemented) of interest shows that you are heading in the right direction.

Business rules

The objective should be to handle the three distinct relationships at hand separately, starting with a logical level analysis before considering the implementation aspects in full. In this regard, it is quite helpful to write down some formulations that describe the relevant business rules, e.g.:

First, for the following many-to-many (M:N) relationship:

  • A product is classified by one-to-many categories
  • A category classifies zero-one-or-many products

Which implies the existence of an associative entity type that I am going to call product_category.

Second, for a distinct M:N relationship:

  • A category is integrated by zero-one-or-many labels
  • A label integrates zero-one-or-many categories

Situation that suggests that there is another associative entity type, in this case the one that I will name category_label.

Then, it is time to manage another M:N relationship, this time between the two associative entity types discussed above:

  • A product_category may receive zero-one-or-many label_assignments
  • A category_label may take part in zero-one-or-many label_assignments

As noted, I have included a new entity type that I denominated label_assignment but, naturally, you may name it using a term that is more meaningful with respect to your business domain.

I have assumed, based on the structure of the categories table contained in your question (specifically the column categories.category_parent_id), that there is a self-recursive one-to-many (1:M) relationship concerning the entity type called category. Later, you confirmed such situation, so the following rule applies as well:

  • A category comprises zero-one-or-many categories

Logical Model

Then I have derived an IDEF1X1 logical model from the business rules formulations presented above, that is shown in Figure 1:

Figure 1 - Product Classification Data Model

With this arrangement you can solve much of your needs, since:

  • Each product has to first be related to a certain category before it receives a label assignment.
  • A label cannot be assigned to a particular product if it has not been connected to a certain category previously.
  • A product can be related to a certain category without having to be involved in a relationship with a label.

Expository DDL structure

Consequently, I coded the following DDL structure (tested on SQL Fiddle):

-- You should determine which are the most fitting 
-- data types and sizes for all your table columns 
-- depending on your business context characteristics.

-- As one would expect, you are free to make use of 
-- your preferred (or required) naming conventions.

CREATE TABLE product
(
     product_id       INT      NOT NULL,
     product_code     CHAR(30) NOT NULL,
     name             CHAR(30) NOT NULL,
     description      CHAR(90) NOT NULL,
     created_datetime DATETIME NOT NULL,
     CONSTRAINT PK_product             PRIMARY KEY (product_id),
     CONSTRAINT AK_product_code        UNIQUE      (product_code), -- (Possible?) ALTERNATE KEY.
     CONSTRAINT AK_product_name        UNIQUE      (name),         -- ALTERNATE KEY.
     CONSTRAINT AK_product_description UNIQUE      (description)   -- ALTERNATE KEY.
);

CREATE TABLE category
(
    category_number         INT      NOT NULL,
    parent_category_number  INT      NULL, -- Set up as ‘NULLable’, in order to focus on the main aspects of the approach exposed.
    name                    CHAR(30) NOT NULL,  
    description             CHAR(90) NOT NULL,
    created_datetime        DATETIME NOT NULL,
    CONSTRAINT PK_category                         PRIMARY KEY (category_number),
    CONSTRAINT AK_category_name                    UNIQUE      (name),        -- ALTERNATE KEY.  
    CONSTRAINT AK_category_description             UNIQUE      (description), -- ALTERNATE KEY.
    CONSTRAINT FK_FROM_category_TO_parent_category FOREIGN KEY (parent_category_number)
        REFERENCES category  (category_number)
);

CREATE TABLE label
(
    label_number     INT      NOT NULL,
    name             CHAR(30) NOT NULL,    
    description      CHAR(90) NOT NULL,  
    created_datetime DATETIME NOT NULL,
    CONSTRAINT PK_label             PRIMARY KEY (label_number),
    CONSTRAINT AK_label_name        UNIQUE      (name),       -- ALTERNATE KEY.
    CONSTRAINT AK_label_description UNIQUE      (description) -- ALTERNATE KEY.  
);

CREATE TABLE product_category -- Associative table.
(
    product_id          INT      NOT NULL,
    category_number     INT      NOT NULL, 
    classified_datetime DATETIME NOT NULL,  
    CONSTRAINT PK_product_category                  PRIMARY KEY (product_id, category_number),
    CONSTRAINT FK_FROM_product_category_TO_product  FOREIGN KEY (product_id)
        REFERENCES product  (product_id),
    CONSTRAINT FK_FROM_product_category_TO_category FOREIGN KEY (category_number)
        REFERENCES category (category_number)
);

CREATE TABLE category_label -- Associative table.
(
    category_number     INT      NOT NULL,
    label_number        INT      NOT NULL,   
    integrated_datetime DATETIME NOT NULL,  
    CONSTRAINT PK_category_label                  PRIMARY KEY (category_number, label_number),
    CONSTRAINT FK_FROM_category_label_TO_category FOREIGN KEY (category_number)
        REFERENCES category (category_number),
    CONSTRAINT FK_FROM_category_label_TO_label    FOREIGN KEY (label_number)
        REFERENCES label    (label_number)
);

CREATE TABLE label_assignment -- Associative table that ‘concretizes’ a relationship between two distinct relationships.
(
    product_id        INT       NOT NULL,
    category_number   INT       NOT NULL,
    label_number      INT       NOT NULL,  
    assigned_datetime DATETIME  NOT NULL,
    CONSTRAINT PK_label_assignment                          PRIMARY KEY (product_id, category_number, label_number), -- Composite PRIMARY KEY.
    CONSTRAINT FK_FROM_label_assignment_TO_product_category FOREIGN KEY (product_id, category_number)   -- Composite FOREIGN KEY.
        REFERENCES product_category (product_id, category_number),
    CONSTRAINT FK_FROM_label_assignment_TO_category_label   FOREIGN KEY (category_number, label_number) -- Composite FOREIGN KEY.
        REFERENCES category_label   (category_number, label_number)
 );

Pay especial attention to the two composite FOREIGN KEY definitions of the label_assignment table, because the category_number attribute is included in both of them.

You brought up a requirement that stipulates that:

A product has to be assigned to category in order to be visible in the shop.

Therefore, you should guarantee that each time that you INSERT a product row you as well link it with a certain category by means of the INSERTion of a row in the associative table here called product_category. In this way, both operations should be executed within the same ACID TRANSACTION, so that they either succeed or fail as a single unit.

Similar scenarios

You might find of help my answer to

and also the @Ypercubeᵀᴹ answer to


Endnote

1. Integration Definition for Information Modeling (IDEF1X) is a highly recommendable data modeling technique that was established as a standard in december 1993 by the United States National Institute of Standards and Technology (NIST). It is solidly based on (a) theoretical work authored by the originator of the Relational Model, i.e., Dr. E. F. Codd; on (b) the Entity-Relationship view, developed by Dr. P. P. Chen; and also on (c) the Logical Database Design Technique, created by Robert G. Brown. It is worth noting that IDEF1X was formalized by way of first-order logic.

Related Question