Representing N:N relation as a functional dependency in a database design

database-design

As a software developer, I already have some experience in designing a more or less normalized database schema, but I haven't received any formal training in it before. This semester, I took a university class on databases. We are being taught the formal way of designing a schema based on relational algebra. First, we collect the attributes we want to store:

bookstore<TITLE, AUTHOR, CATEGORY, YEAR, PUBLISHER, PUBLISHER_ADDRESS>

Then we find functional dependencies between the attributes:

fbook : { TITLE, AUTHOR } -> { CATEGORY, YEAR, PUBLISHER, PUBLISHER_ADDRESS }
fpublisher : { PUBLISHER } -> { PUBLISHER_ADDRESS }

After that we can normalize this schema. (For the sake of simplicity, I assumed that a book is defined by its title and author alone, no two editions or two copies of the same edition are stored.)

Now what happens when a book can have more than one category? How do I represent that relation with dependencies? The category can't be a secondary attribute any more, but if it's primary, how do I proceed?

What does this define?

{ TITLE, AUTHOR, CATEGORY } -> ???

We are told that the empty set cannot be on the right side of a dependency.

Best Answer

The question you raise has to do with the definition of first normal form (1NF). Whether the answer directly involves functional dependencies depends in part on the definitions you accept. Wikipedia has a fairly simple article about 1NF.

title                                author    year  category   
--
An Introduction to Database Systems  CJ Date   2003  databases, modeling, storage, retrieval

If you look at the column "category" one way, it contains a single value. Depending on your dbms and your design, that value might be the string "databases, modeling, storage, retrieval", or it might be the array "{databases, modeling, storage, retrieval}".

If you look at the column "category" another way, it contains four values. Those values are the four strings "databases", "modeling", "storage", and "retrieval".

In database design, the solution is to use two tables. But I don't think you can decompose the "bad" table by projection (which CJ Date identifies as the decomposition operator), because projection doesn't split the content of a column into multiple rows. (Projection doesn't give you four rows from the single value "databases, modeling, storage, retrieval", which is what you need to do. "Join", the recomposition operator, doesn't yield a single value like "databases, modeling, storage, retrieval", either.)

The inability to decompose by projection suggests that the solution to this problem doesn't have to do with functional dependencies. The resulting table would have three attributes, {title, author, category}, the only key would also be {title, author, category}, and that table would be in 5NF.