Database Design – Creating a One-to-One Relationship Correctly

database-design

Imagine the following situation:

> A person has a passport.
> A person *owns only* one passport.
> One passport can only *be owned by a single* person. 

This is a clear case of a one-to-one relationship. For the sake of simplicity let us imagine that a person only has a name and the passport only has a nationality. What is bothering me the most is that it seems that everyone is doing this differently. From what I can tell there are four strategies that can be taken to map this relationship:

  1. Everything on the same table:

  1. A foreign key on the owner side that references the owned side.

enter image description here

  1. Same value for primary keys on both tables.

enter image description here

  1. A foreign key on the owned side with a unique key on top:

enter image description here

  • Number 1 is pretty self explanatory and I do not see any problem when
    the entities are small, but if they have a lot of fields we will have
    a gigantic table.
  • Number 2 seems fine and this is the way I have seen most people and frameworks do it (e.g. Entity Framework, Hibernate) and it is also the way I was taught in school. The big problem here is that, because of referential integrity, our deletion logic will be upside down. I should be able to delete a passport without any problem, but in this case I will not be able to delete it without deleting the person as well, which does not make any sense.
  • Number 3 and 4 seem pretty much identical. With the foreign key on the passport side, I am able to delete passports without deleting people, which makes sense. If I remove a person, I should have to remove their passport. The main advantage that I see for using option number 4 instead of number 3 or all the others, is that if for ever reason I decide that I now want a user to have multiple passports I only have to remove the unique key constraint, which is incredibly easy and a lot less hassle than changing the keys from both tables.

So my question basically is reduced to the following: With so many advantages with number 4 why do people keep using the other strategies? My biggest grip is with ORMs such as Hibernate which, in my opinion, do things the opposite way. The owning side of the relation tracked by Hibernate is the side of the relation that owns the foreign key in the database. So if I tried to do this in Java with Hibernate my relationship would be swapped. If I included the foreign key to the passport on the person table like it wants me too, it would ruin the deletion logic like I have explained before. I have the impression that EF also works this way. So, with all of this, why do people keep preferring the "disadvantageous" approaches?

Best Answer

The most important thing is to word constraints precisely and not allow for logical errors. Options 1, 2, 3 contain logical errors. Option 4 is close, but may not represent reality very good -- in general one person may have more than one passport (dual citizenship).

Option 1 is not realistic, because it states that there can not exist a person without a passport.

Option 2 essentially allows a passport to exist without a person..

Option 3 states that passport is a person.

Option 4 is the closest and can be worded as:

[P1] Person (PERSON_ID) named (NAME) exists.

(c1.1) Person is identified by PERSON_ID.

(c1.2) Each person has exactly one name; for each name, more than one person can have that name.

[P2] Passport (PASSPORT_ID) issued by country (COUNTRY) is owned by person (PERSON_ID)

(c2.1) Passport is Identified by PASSPORT_ID.

(c2.2) Each passport is issued by exactly one country; for each country, more than one passport can be issued by that country.

(c2.3) Each passport is owned by exactly one person; for each person that person may own at most one passport.

(c2.4) If a passport issued by a country is owned by a person then that person must exist.

person {PERSON_ID, NAME}  -- p1
   KEY {PERSON_ID}        -- c1.1



passport {PASSPORT_ID, COUNTRY, PERSON_ID}  -- p2
     KEY {PASSPORT_ID}                      -- c2.1
     KEY {PERSON_ID}                        -- c2.3

FOREIGN KEY {PERSON_ID} REFERENCES person {PERSON_ID} -- c2.4

Note:

[Px]   = predicate x
[cx.y] = constraint x.y

KEY = PK or AK

PK  = PRIMARY KEY
AKn = ALTERNATE KEY (UNIQUE)
FKn = FOREIGN KEY

All attributes NOT NULL

EDIT

Just to be clear, option 3 would have been the correct choice had you chosen a different example. Say, Person & Employee or Employee & Accountant instead of Person & Passport. Because an employee is a person, and an accountant is an employee. The is-a relationship implies a proper subset.

This is a great example of how focusing on technical details may introduce logical errors, and there is no such a thing as a small logical error.