I have a database with three tables:
books
,authors
andbooks_authors
books_authors
is a “junction” table (i.e., it represents a conceptual level many-to-many relationship), so if it has a row with book_id=1
and author_id=5
, it means that author 5 is an author of book 1.
Now, the problem is that author names are sometimes written differently depending on the book. For example, in book 1 the name of author 5 is written “洋子”, but in book 2 the name of author 5 is written “ようこ”.
Sometimes these are variations of the same name (ようこ = 洋子) but sometimes they are completely different (e.g., an author changes his or her name after getting married.)
What's the best way to retain these name variations? Should I enter separate rows for the values “洋子” and “ようこ”, and then link them with another ID? Or should I make an array to keep all the variations in a single row?
Additional information
In response to some clarification requests made via comments, I am going to explain several significant aspects in order to contextualize the scenario more fully:
-
Authors can actually have six separate name values (First, First-reading, Last, Last-Reading, Full, Full-Reading) but the only one required is Full. The rest "can be" null.
-
An Author does not have a, say, "base name" in the business domain of interest.
-
I retain Author Names only when they are associated with a Book.
-
Authors can be Artists, Editors, etc. I use (what I call) the "junction" table to keep track of which Role they play for each Book.
-
I think that, usually, multiple Name Variations represent different values of the same domain, as in the case of “ようこ=洋子”, but I believe that sometimes they are different domains (e.g., as in the situation mentioned above, when an Author changes their Last Name after getting married).
-
I want to manipulate the corresponding tables via SELECT operations that, e.g., include the column(s) containing the Authors Names as (a) condition(s) in the WHERE clauses.
-
The full properties of interest regarding an Author are just the six Name (what I call) "fields", their Variants, and the Books with which they are associated.
-
The property whose values uniquely identify an Author instance is only the ID, as far as I can tell.
Note: I'm using Postgres but this question could apply to other relational database management systems.
Best Answer
I may just be repeating @jeffmc's answer, but I would add an
author_names
table that has a one-to-many relationship withauthors
, and then make your junction table (books_authors
) a junction ofbooks
andauthor_names
. (For clarity you may want to rename itbooks_author_names
.) This way you are able to group authors by all their various identities/pseudonyms while being to tell exactly which moniker was used in the publication of a specific book.I think this is a more precise way of doing it than using a time-based structure, as an author may use multiple identities simultaneously (e.g., Stephen King published both as Richard Bachman and Stephen King for several years).