Mysql – Advice on a basic design, first time database design

database-designforeign keyMySQL

I am taking a course to become a Java developer.

The course involves using databases but we never really design any, unfortunately.

Most of the time we get pre-made databases and we have to implement code on it to Insert, Update, Read or Delete data.

But when my final test comes, chances are I will be making something that involves a database and therefore I want to try designing a few smaller ones to get the hang of a designing since I noticed that a good database design makes a lot of difference when writing the code to work with it.

I hope these types of questions are allowed here and that it's not going to be too broad.


This is a small design I made for something a simple as clubs and members with addresses and phonenumbers.

  • There will be multiple clubs.
  • Each club will have multiple members (obvious)
  • A member cannot be part of multiple clubs
  • A member can have multiple phonenumbers and email addresses
  • A member can only have one address
  • An address can belong to different members (couples or siblings)

My biggest confusions:

  • If I want to add a column in Club that describes the owner, who will also be a member, what is the best approach without having the
    same member listed twice?

  • Should I put all my tables on the automatic increment of an id or would this be a bad idea? (benefits/downsides?)

  • If I add foreign keys in the foreign keys tab, will these automatically correspond to the correct table or do I have to add
    these to the columns also? (see picture 2)

  • And my probably noobiest question at all… Do I put the foreign keys of phonenumber and email in their respective table
    linking to a person_Id or should I put phoneNumber and email Id's in
    the person table?

(My apologies for the language in the pictures not being English, I hope this isn't too much of a problem)

Design

Foreign keys

Best Answer

Responses to your individual questions

If I want to add a column in Club that describes the owner, who will also be a member, what is the best approach without having the same member listed twice?

If —as stated in your specifications— a Person can be a Member of only one Club, and you are interested in storing this datum simply as true or false, one option is adding a BIT(1) or a BOOLEAN (TINYINT) column to the Person table, and you may wish to call this column IsClubOwner. In this way, you can register the fact that a determined person is a club owner exclusively one time. Apart from that, this method also allows the possibility of retaining several people as owners of the same club occurrence. You can see a logical level depiction of this approach in Figure 1.

However, you are looking for the best approach and, according to my experience, such approach entails the development of a much more expandable and versatile structure. In this respect, follow the progression of a modeling exercise for these and other points below, in the sections entitled “Covering your remaining specifications”, “Person as Member of multiple Clubs” and “Member and Owner as separate entity types”.

Should I put all my tables on automatic increment of an id or would this be a bad idea? (benifits/downsides?)

If you face an explicit requirement that indicates the definition of a table with a column of such characteristics, and that column has a valid contextual meaning or serves a particular purpose like being a surrogate for a wide natural PRIMARY KEY (PK), then yes, you should proceed that way.

Otherwise, if you do not have said requirement, I consider it would be unnecesary since you have to store and manage a meaningless extra column and (perhaps?) also an additional INDEX in your database.

As usual, you have to analyze each case along with its overall repercussions in order to decide how to carry on.

If I add foreign keys in the foreign keys tab, will these automatically correspond to the correct table or do I have to add these to the columns also?

In this regard, my advice for you is to create your database structure manually, to code your own DDL statements until you get a firm grasp of the subject. If you do so, it will be easier for you to understand the processes that graphical tools are performing “under the hood”.

To me, for instance, creating a statement like the following:

CONSTRAINT FK_PersonPhoneNumber_to_Person FOREIGN KEY (PersonId)
REFERENCES Person (PersonId)

Is much more instructive than using GUI tools to execute this kind of tasks, especially now that you are building your first designs.

And my probably most noobiest question of all... Do I put the foreign keys of phone_number and email in their respective table linking to a person_id or should I put phone_number and email ids in the person table?

Personally, I think that this and all your other questions are completely valid and well contextualized.

Returning to the technical aspects that concern us, this precise inquiry offers a good opportunity to review the two following assertions:

  • A Person can be reached through zero-one-or-many PhoneNumbers
  • A PhoneNumber can be used to reach one-to-many People

So, one can conclude that there is a many-to-many relationship between Person and PhoneNumber, therefore, I suggest the creation of an associative table named PersonPhoneNumber to represent said relationship in your database. The PK of this table should be composed by two different columns: PersonId (a FOREIGN KEY [FK] pointing to Person.PersonId) and PhoneNumber (a FK that makes reference to PhoneNumber.Number). For a logical level description of all of the above, see Figure 1.

On the other hand, let us examine the two propositions that follow:

  • A Person can be contacted via zero-one-or-many EmailAddresses
  • An EmailAddress can be used to contact exactly one Person

Then, yes, you should set a FK referencing to Person from the EmailAddress table, and this table should have a composite PK too, which would be comprised of the columns PersonId and Address. In this manner, you can ensure that the same combination of EmailAddress.PersonId and EmailAddress.Address can be inserted only once.

If you also wish to ensure that a given EmailAddres.Address can be stored in one sole row, you just have to establish a UNIQUE CONSTRAINT for this column.

Proposed logical data models

In order to expose my suggestions more clearly, I included four distinct IDEF1X[1] logical models that are shown in Figure 1, Figure 2, Figure 3 and Figure 4. I will provide an explanation of the most relevant features displayed in each one of them in the corresponding sections. You can as well download from Dropbox a PDF that integrates in a single model the majority of the elements under discussion.

Covering your remaining specifications

Relating People and Addresses

Let us inspect the two following (slightly reworded) assertions that are relevant to people and addresses:

  • A Person can only have one Address
  • An Address can belong to different People (couples or siblings)

So, to deal with these restrictions, you may opt to implement a data model similar to the one you can see in Figure 1.

Fig. 1. Clubs and Members Data Model - First Specifications

Taking into account the referred model, you should follow the next steps:

  1. Create a table called PersonAddress fixing a relationship between Person and Address. Set the columns PersonId and AddressId as the compound PK of this table.

  2. Configure a UNIQUE CONSTRAINT for the PersonAddress.PersonId column to ensure that a specific value can be inserted in at most one row of said table. At the logical level, this circumstance implies that PersonAddress.PersonId has become an ALTERNATE KEY[2].

  3. If the value of AddressId in a determined PersonAddress insertion attempt has not being already stored, then let the insertion continue, otherwise, when such value already exists in a row, you have to check that (a) the PersonId who has registered that AddressId is also registered as the Marriage.WifeId if the PersonId is a male (datum derived via the Person.GenreCode) or (b) that the PersonId is the Marriage.HusbandId when the PersonId is a female (derived by virtue of Person.GenreCode, as well). If one of these conditions is met in the appropiate situation, then you should let the INSERT go on.

  4. If the above conditions were not fulfilled, there is still a chance for a PersonAddress insertion try to be succesful. You have to check that the PersonId value that is involved in said insertion try shares at least one Progeny.ParentId with the PersonId that has already registered the PersonAddress.AddressId. If this condition is satisfied, then it means that they are stored as Siblings in the database, so this INSERT must succeed.

As in any relational database implementation, you should seriously consider executing your DML operations inside ACID Transactions so that you can protect the integrity and consistency of the data you are working with.

Attending the requirements you added in comments: Addresses and Phone Numbers serving equally Clubs and People

On condition that you want to give the chance for addresses and phone numbers to serve both people and clubs, you might make use of a supertype-subtype relationship. Here is an answer in which I give a more detailed treatment to this kind of structures, in case you are interested.

In the present scenario, you could define Person and Club as subtypes of a new entity named Party, a term commonly used in legal circles to stand for (a) a person or (b) a group of persons (as noted in sense no. 6). With this method, the relationships between Addresses (or PhoneNumbers) and People and Clubs would be defined through Party, the supertype. See Figure 2 for a depiction of this suggestion.

Fig. 2. Clubs and Members Data Model - Second Specifications

Party and Address

So we can read in this new model that:

  • A Party keeps zero-one-or-many Addresses
  • An Address is kept by one-to-many Parties

Thus, there is a many-to-many relationship involving Party and Address that is expressed by way of the PartyAddress entity.

Party and PhoneNumber

Furthermore, we can interpret that:

  • A Party is reached through zero-one-or-many PhoneNumbers
  • A PhoneNumber is used by one-to-many Parties

That is why I added the PartyPhoneNumber entity that describes the many-to-many association that takes effect between the Party and PhoneNumber entity types.

Party and Club or Person

Then, it can also be read that:

  • A Party is either a Club or a Person

Hence, Party supplies a connection from either Clubs or People to Addresses (or PhoneNumbers).

Person as Member of multiple Clubs

As @aldwinaldwin mentiones in his answer, if you want to provide the functionality for a person to be a member of multiple clubs, then you can include a table called ClubMember that would be acting as another many-to-many relaltionship, this time, naturally, interconnecting Person and Club.

I am going to add to the above that this table can also be useful in the goal of storing any concrete person as owner of multiple clubs, by means of the inclusion of the already mentioned IsClubOwner boolean column. In fact, one can say that this new table is the representation of an integral entity type in its own right.

As demonstrated in Figure 3, such table requires a composite PK made up of the columns ClubId and MemberId (a role name[3] given to PersonId), and these columns have to be defined also as FKs pointing, correspondingly, to Club and Person.

Fig. 3. Clubs and Members Data Model - Person as Member of Multiple Clubs

A more adaptable structure

Employing this setting you can also comply with the initial rule that states that a Person can be a member of only one Club, but you just have to add a UNIQUE CONSTRAINT to the MemberId column, so that a certain value can be entered in no more than one occasion. So, as you can note, this structure is much more adaptable that the one shown in Figure 1, since by dropping the UNIQUE CONSTRAINT (or INDEX) you would open the aforementioned functionality of permitting a person to become a member of different clubs at the same time.

Member and Owner as separate entity types

As you know, the storage several facts about the role carried out by a person as owner of a club —besides its mere existance in a true or false attribute— can be very advantageous. For example, you may want to keep the effective date in which a definite person became the owner of a club, hence you can manage this situation by introducing a separate entity type called ClubOwner, as presented in Figure 4.

Fig. 4. Clubs and Members Data Model - Member and Owner as Separate Entity Types

Once you construct a table based on this new entity type, you can add the fitting columns that represent the characteristics that come into play exclusively when a Person is the Owner of a Club. As depicted, this table would hold a PK composed of the FK columns referencing Person.PersonId and Club.ClubId, this way any combination of ClubOwner.OwnerId (or ClubOwner.PersonId, if you prefer) and ClubOwner.ClubId can be inserted in just one opportunity.

Of course, with this configuration you can still derive in boolean form if a Person is the Owner of a particular Club with the aid of a query that returns a scalar value which can be evaluated to true or false.


Notes

1. Integration Definition for Information Modeling (IDEF1X) is a highly recommendable data modeling technique that was defined as a standard in december 1993 by the United States National Institute of Standards and Technology (NIST). It is solidly based on (a) some of the theoretical papers authored by the originator of the Relational Model, i.e., Dr. E. F. Codd; on (b) the Entity-Relationship theory, developed by Dr. P. P. Chen; and also on (c) the Logical Database Design Technique, created by Robert G. Brown. It is worth noting that IDEF1X was formalized by way of first-order logic.

2. An ALTERNATE KEY is an attribute (or a combination of attributes) that holds values that uniquely identify an entity occurrence but was not chosen as the PK of the pertinent entity type; each entity type can have zero, one or more ALTERNATE KEYS. In an IDEF1X model, they are are indicated as “AK” plus its respective number, e.g., AK1, AK2, etc. They are usually implemented in a SQL DDL structure via a UNIQUE CONSTRAINT (or a UNIQUE INDEX).

3. Role names are denotations (or aliases) assigned to FK attributes in order to express the meaning that they hold within the scope of their respective entity. Their usage is recommended since 1970 by Dr. Codd in his seminal paper entitled “A Relational Model of Data for Large Shared Data Banks”. For its part, IDEF1X —keeping fidelity regards relational practices— also advocates role naming.