How to handle variations of the same data point

I have a database with three tables:

books,
authors and
books_authors

books_authors is a “junction” table (i.e., it represents a conceptual level many-to-many relationship), so if it has a row with book_id=1 and author_id=5, it means that author 5 is an author of book 1.

Now, the problem is that author names are sometimes written differently depending on the book. For example, in book 1 the name of author 5 is written “洋子”, but in book 2 the name of author 5 is written “ようこ”.

Sometimes these are variations of the same name (ようこ = 洋子) but sometimes they are completely different (e.g., an author changes his or her name after getting married.)

What's the best way to retain these name variations? Should I enter separate rows for the values “洋子” and “ようこ”, and then link them with another ID? Or should I make an array to keep all the variations in a single row?

Additional information

In response to some clarification requests made via comments, I am going to explain several significant aspects in order to contextualize the scenario more fully:

Authors can actually have six separate name values (First, First-reading, Last, Last-Reading, Full, Full-Reading) but the only one required is Full. The rest "can be" null.
An Author does not have a, say, "base name" in the business domain of interest.
I retain Author Names only when they are associated with a Book.
Authors can be Artists, Editors, etc. I use (what I call) the "junction" table to keep track of which Role they play for each Book.
I think that, usually, multiple Name Variations represent different values of the same domain, as in the case of “ようこ=洋子”, but I believe that sometimes they are different domains (e.g., as in the situation mentioned above, when an Author changes their Last Name after getting married).
I want to manipulate the corresponding tables via SELECT operations that, e.g., include the column(s) containing the Authors Names as (a) condition(s) in the WHERE clauses.
The full properties of interest regarding an Author are just the six Name (what I call) "fields", their Variants, and the Books with which they are associated.
The property whose values uniquely identify an Author instance is only the ID, as far as I can tell.

Note: I'm using Postgres but this question could apply to other relational database management systems.

Best Answer

I may just be repeating @jeffmc's answer, but I would add an author_names table that has a one-to-many relationship with authors, and then make your junction table (books_authors) a junction of books and author_names. (For clarity you may want to rename it books_author_names.) This way you are able to group authors by all their various identities/pseudonyms while being to tell exactly which moniker was used in the publication of a specific book.

I think this is a more precise way of doing it than using a time-based structure, as an author may use multiple identities simultaneously (e.g., Stephen King published both as Richard Bachman and Stephen King for several years).

Best Answer

Related Solutions

Entity-Relationship Diagram for a scenario involving Books, Authors and Countries

How should I best design the tables and relationships, given the following rules

Related Question