Mysql – Should I use a compound or surrogate PK in a many-to-many table

database-designforeign keyMySQLperformanceprimary-key

I have a database:

DROP TABLE IF EXISTS `books`;

CREATE TABLE `books` (
  `isbn` VARCHAR(255) NOT NULL,
  `title` VARCHAR(255) NULL DEFAULT NULL,
  PRIMARY KEY (`isbn`)
) COMMENT 'Books used at this school';

DROP TABLE IF EXISTS `classes`;

CREATE TABLE `classes` (
  `class_id` INT(10) NOT NULL AUTO_INCREMENT,
  `teacher_id` SMALLINT(5) NULL DEFAULT NULL,
  PRIMARY KEY (`class_id`)
) COMMENT 'Classes at the school';

DROP TABLE IF EXISTS `b_c`;

CREATE TABLE `b_c` (
  `isbn` VARCHAR(255) NOT NULL,
  `class_id` INT(10) NOT NULL,
  PRIMARY KEY (`isbn`)
) COMMENT 'Books to classes';

ALTER TABLE `b_c` ADD FOREIGN KEY (isbn) REFERENCES `books` (`isbn`) 
    ON UPDATE CASCADE;
ALTER TABLE `b_c` ADD FOREIGN KEY (class_id) REFERENCES `classes` (`class_id`) 
    ON UPDATE CASCADE;

The issue I'm having is that I would like to normalize data as much as possible (I don't want multiple entries for the same relationship to be entered into the table b_c), but I would like to only store what data is absolutely pertinent.

My first idea to deal with this is to just create a compound primary key for the b_c table consisting of the fields isbn and class_id which would solve the issue of having duplicate relationships in the table, however, I have heard strong opinions on having a unique identifier for every row in a table like this. The justification for having a unique identifier for every row seems to be that it's useful to be able to specify a specific row, though I don't see a situation in which this would become useful. Can someone offer an example?

Another criticism I've heard is that using compound PKs in this way can make JOINs extremely taxing. Can someone comment on the performance of these two different methods?

The question boils down to "Is it worth it to add an id field to the b_c table or is the use of compound PKs enough to properly represent the relationship between the books and classes tables?

If you have any other comments about the design not directly pertaining to the question, I would love to hear them and thank you in advance for you help.

Best Answer

Well, it seems like all your data is meaningful, since you don't have a lot of it, and it all seems to have a role as a key or useful attribute.

If you have PK (by definition unique) on isbn in b_c, then this restricts a book to one class. Is that true? At that point you could argue the design that the class_id should simply then be an attribute of the book table and you don't even need the b_c table.

Since you already have a PK on b_c, I don't see the need for a surrogate key. Even if you were to expand to compound primary key on isbn, class_id to be able to allow multiple classes for a book, I don't really see a need for an additional surrogate unique key. In any case, it would only be an alternative unique key, I probably wouldn't make it the primary key and probably wouldn't use it in joins (joining to link tables is not terribly common as a foreign key, since they are usually identified by their parent or child as being part of a collection based on that relationship)

This is what I would do:

DROP TABLE IF EXISTS `books`;

CREATE TABLE `books` (
  `book_id` INT(10) NOT NULL AUTO_INCREMENT,
  `isbn` VARCHAR(255) NOT NULL,
  `title` VARCHAR(255) NULL DEFAULT NULL,
  PRIMARY KEY (`book_id`)
) COMMENT 'Books used at this school';

/* 
  Also consider a unique constraint here on isbn
  Also consider whether to allow NULL isbn when it isn't yet known
*/

DROP TABLE IF EXISTS `classes`;

CREATE TABLE `classes` (
  `class_id` INT(10) NOT NULL AUTO_INCREMENT,
  `teacher_id` SMALLINT(5) NULL DEFAULT NULL,
  PRIMARY KEY (`class_id`)
) COMMENT 'Classes at the school';

DROP TABLE IF EXISTS `b_c`;

CREATE TABLE `b_c` (
  `book_id` INT(10) NOT NULL,
  `class_id` INT(10) NOT NULL,
  PRIMARY KEY (`book_id`, `class_id`) -- note that book is no longer unique by itself
) COMMENT 'Books to classes';

ALTER TABLE `b_c` ADD FOREIGN KEY (book_id) REFERENCES `books` (`book_id`) ON UPDATE CASCADE;
ALTER TABLE `b_c` ADD FOREIGN KEY (class_id) REFERENCES `classes` (`class_id`) ON UPDATE CASCADE;