The proper structure for this scenario is a SubClass / Inheritance model, and is nearly identical to the concept I proposed in this answer: Heterogeneous ordered list of value.
The model proposed in this question is actually quite close in that the Animal
entity contains the type (i.e. race
) and the properties that are common across all types. However, there are two minor changes that are needed:
Remove the Cat_ID and Dog_ID fields from their respective entities:
The key concept here is that everything is an Animal
, regardless of race
: Cat
, Dog
, Elephant
, and so on. Given that starting point, any particular race
of Animal
doesn't truly need a separate identifier since:
- the
Animal_ID
is unique
- the
Cat
, Dog
, and any additional race
entities added in the future do not, by themselves, fully represent any particular Animal
; they only have meaning when used in combination with the information contained in the parent entity, Animal
.
Hence, the Animal_ID
property in the Cat
, Dog
, etc entities is both the PK and the FK back to the Animal
entity.
Differentiate between types of breed
:
Just because two properties share the same name does not necessarily mean that those properties are the same, even if the name being the same implies such a relationship. In this case, what you really have is actually CatBreed
and DogBreed
as seperate "types"
Initial Notes
- The SQL is specific to Microsoft SQL Server (i.e. is T-SQL). Meaning, be careful about datatypes as they are not the same across all RDBMS's. For example, I am using
VARCHAR
but if you need to store anything outside of the standard ASCII set, you should really use NVARCHAR
.
- The ID fields of the "type" tables (
Race
, CatBreed
, and DogBreed
) are not auto-incrementing (i.e. IDENTITY in terms of T-SQL) because they are application constants (i.e. they are part of the application) that are static lookup values in the database and are represented as enum
s in C# (or other languages). If values are added, they are added in controlled situations. I reserve the use of auto-increment fields for user data that comes in via the application.
- The naming convention I use is to name each subclass table starting with the main class name followed by the subclass name. This helps organize the tables as well as indicates clearly (without looking at the FKs) the relationship of the subclass table to the main entity table.
- Please see "Final Edit" section at the end for a note regarding Views.
"Breed" as "Race"-Specific Approach
![Breed As Race-specific Diagram](https://i.stack.imgur.com/GrBRU.jpg)
This first set of tables are the lookup / types tables:
CREATE TABLE Race
(
RaceID INT NOT NULL PRIMARY KEY
RaceName VARCHAR(50) NOT NULL
);
CREATE TABLE CatBreed
(
CatBreedID INT NOT NULL PRIMARY KEY,
BreedName VARCHAR(50),
CatBreedAttribute1 INT,
CatBreedAttribute2 VARCHAR(10)
-- other "CatBreed"-specific properties as needed
);
CREATE TABLE DogBreed
(
DogBreedID INT NOT NULL PRIMARY KEY,
BreedName VARCHAR(50),
DogBreedAttribute1 TINYINT
-- other "DogBreed"-specific properties as needed
);
This second listing is the main "Animal" entity:
CREATE TABLE Animal
(
AnimalID INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
RaceID INT NOT NULL, -- FK to Race
Name VARCHAR(50)
-- other "Animal" properties that are shared across "Race" types
);
ALTER TABLE Animal
ADD CONSTRAINT [FK_Animal_Race]
FOREIGN KEY (RaceID)
REFERENCES Race (RaceID);
This third set of tables are the complimentary sub-class entities that complete the definition of each Race
of Animal
:
CREATE TABLE AnimalCat
(
AnimalID INT NOT NULL PRIMARY KEY, -- FK to Animal
CatBreedID INT NOT NULL, -- FK to CatBreed
HairColor VARCHAR(50) NOT NULL
-- other "Cat"-specific properties as needed
);
ALTER TABLE AnimalCat
ADD CONSTRAINT [FK_AnimalCat_CatBreed]
FOREIGN KEY (CatBreedID)
REFERENCES CatBreed (CatBreedID);
ALTER TABLE AnimalCat
ADD CONSTRAINT [FK_AnimalCat_Animal]
FOREIGN KEY (AnimalID)
REFERENCES Animal (AnimalID);
CREATE TABLE AnimalDog
(
AnimalID INT NOT NULL PRIMARY KEY, -- FK to Animal
DogBreedID INT NOT NULL, -- FK to DogBreed
HairColor VARCHAR(50) NOT NULL
-- other "Dog"-specific properties as needed
);
ALTER TABLE AnimalDog
ADD CONSTRAINT [FK_AnimalDog_DogBreed]
FOREIGN KEY (DogBreedID)
REFERENCES DogBreed (DogBreedID);
ALTER TABLE AnimalDog
ADD CONSTRAINT [FK_AnimalDog_Animal]
FOREIGN KEY (AnimalID)
REFERENCES Animal (AnimalID);
The model using a shared breed
type is shown after the "Additional Notes" section.
Additional Notes
- The concept of
breed
seems to be a focal point for confusion. It was suggested by jcolebrand (in a comment on the question) that breed
is a property shared across the different race
s, and the other two answers have it integrated as such in their models. This is a mistake, however, because the values for breed
are not shared across the different values of race
. Yes, I am aware that the two other proposed models attempt to solve this issue by making race
a parent of breed
. While that technically solves the relationship issue, it doesn't help solve the overall modeling question of what to do about non-common properties, nor how to handle a race
that does not have a breed
. But, in the case that such a property were guaranteed to exist across all Animal
s, I will include an option for that as well (below).
- The models proposed by vijayp and DavidN (which seem to be identical) do not work because:
- They either
- do not allow for non-common properties to be stored (at least not for individual instances of any
Animal
), or
- require that all properties for all
race
s be stored in the Animal
entity which is a very flat (and nearly non-relational) way of representing this data. Yes, people do this all of the time, but it means having many NULL fields per row for the properties that are not meant for that particular race
AND knowing which fields per row are associated with the particular race
of that record.
- They do not allow for adding a
race
of Animal
in the future that does not have breed
as a property. And even if ALL Animal
s have a breed
, that wouldn't change the structure due to what has been previously noted about breed
: that breed
is dependent on the race
(i.e. breed
for Cat
is not the same thing as breed
for Dog
).
"Breed" as Common- / Shared- Property Approach
![enter image description here](https://i.stack.imgur.com/wNFTB.jpg)
Please note:
The SQL below can be run in the same database as the model presented above:
- The
Race
table is the same
- The
Breed
table is new
- The three
Animal
tables have been appended with a 2
- Even with
Breed
being a now common property, it does not seem right not to have Race
noted in the main/parent entity (even if it is technically relationally correct). So, both RaceID
and BreedID
are represented in Animal2
. In order to prevent a mismatch between the RaceID
noted in Animal2
and a BreedID
that is for a different RaceID
, I have added a FK on both RaceID, BreedID
that references a UNIQUE CONSTRAINT of those fields in the Breed
table. I usually despise pointing a FK to a UNIQUE CONSTRAINT, but here is one of the few valid reasons to do so. A UNIQUE CONSTRAINT is logically an "Alternate Key", which makes it valid for this use. Please also note that the Breed
table still has a PK on just BreedID
.
- The reason for not going with just a PK on the combined fields and no UNIQUE CONSTRAINT is that it would allow for the same
BreedID
to be repeated across different values of RaceID
.
- The reason for not switching which the PK and UNIQUE CONSTRAINT around is that this might not be the only usage of
BreedID
, so it should still be possible to reference a specific value of Breed
without having the RaceID
available.
- While the following model does work, it has two potential flaws regarding the concept of a shared
Breed
(and are why I prefer the Race
-specific Breed
tables).
- There is an implicit assumption that ALL values of
Breed
have the same properties. There is no easy way in this model to have disparate properties between Dog
"breeds" and Elephant
"breeds". However, there still is a way to do this, which is noted in the "Final Edit" section.
- There is no way to share a
Breed
across more than one race. I am not sure if that is desirable to do (or maybe not in the concept of animals but possibly in other situations that would be using this type of model), but it is not possible here.
CREATE TABLE Race
(
RaceID INT NOT NULL PRIMARY KEY,
RaceName VARCHAR(50) NOT NULL
);
CREATE TABLE Breed
(
BreedID INT NOT NULL PRIMARY KEY,
RaceID INT NOT NULL, -- FK to Race
BreedName VARCHAR(50)
);
ALTER TABLE Breed
ADD CONSTRAINT [UQ_Breed]
UNIQUE (RaceID, BreedID);
ALTER TABLE Breed
ADD CONSTRAINT [FK_Breed_Race]
FOREIGN KEY (RaceID)
REFERENCES Race (RaceID);
CREATE TABLE Animal2
(
AnimalID INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
RaceID INT NOT NULL, -- FK to Race, FK to Breed
BreedID INT NOT NULL, -- FK to Breed
Name VARCHAR(50)
-- other properties common to all "Animal" types
);
ALTER TABLE Animal2
ADD CONSTRAINT [FK_Animal2_Race]
FOREIGN KEY (RaceID)
REFERENCES Race (RaceID);
-- This FK points to the UNIQUE CONSTRAINT on Breed, _not_ to the PK!
ALTER TABLE Animal2
ADD CONSTRAINT [FK_Animal2_Breed]
FOREIGN KEY (RaceID, BreedID)
REFERENCES Breed (RaceID, BreedID);
CREATE TABLE AnimalCat2
(
AnimalID INT NOT NULL PRIMARY KEY, -- FK to Animal
HairColor VARCHAR(50) NOT NULL
);
ALTER TABLE AnimalCat2
ADD CONSTRAINT [FK_AnimalCat2_Animal2]
FOREIGN KEY (AnimalID)
REFERENCES Animal2 (AnimalID);
CREATE TABLE AnimalDog2
(
AnimalID INT NOT NULL PRIMARY KEY,
HairColor VARCHAR(50) NOT NULL
);
ALTER TABLE AnimalDog2
ADD CONSTRAINT [FK_AnimalDog2_Animal2]
FOREIGN KEY (AnimalID)
REFERENCES Animal2 (AnimalID);
Final Edit (hopefully ;-)
- Regarding the possibility (and then difficulty) of handling disparate properties between types of
Breed
, it is possible to employ the same subclass / inheritance concept but with Breed
as the main entity. In this setup the Breed
table would have the properties common to all types of Breed
(just like the Animal
table) and RaceID
would represent the type of Breed
(same as it does in the Animal
table). Then you would have subclass tables such as BreedCat
, BreedDog
, and so on. For smaller projects this might be considered "over-engineering", but it is being mentioned as an option for situations that would benefit from it.
For both approaches, it sometimes helps to create Views as short-cuts to the full entities. For example, consider:
CREATE VIEW Cats AS
SELECT an.AnimalID,
an.RaceID,
an.Name,
-- other "Animal" properties that are shared across "Race" types
cat.CatBreedID,
cat.HairColor
-- other "Cat"-specific properties as needed
FROM Animal an
INNER JOIN AnimalCat cat
ON cat.AnimalID = an.AnimalID
-- maybe add in JOIN(s) and field(s) for "Race" and/or "Breed"
- While not part of the logical entities, it is fairly common to have audit fields in the tables to at least get a sense of when the records are being inserted and updated. So in practical terms:
- A
CreatedDate
field would be added to the Animal
table. This field is not needed in any of the subclass tables (e.g. AnimalCat
) as the rows being inserted for both tables should be done at the same time within a transaction.
- A
LastModifiedDate
field would be added to the Animal
table and all subclass tables. This field gets updated only if that particular table is updated: if an update occurs in AnimalCat
but not in Animal
for a particular AnimalID
, then only the LastModifiedDate
field in AnimalCat
would be set.
It is great that you are taking the time to understand, classify and model the data you are dealing with since, from my personal experiencie, all this makes the whole development process easier and very flexible for future changes. And I am quite sure that you are also aware of this already.
Preliminary data model and assumed business rules
I defined a list of business rules that I have assumed after reading your question and examining closely your diagrams, in order to describe my understanging of your specifications. After defining such list, I derived an IDEF1X[1] data model that I decided to upload as a .PDF document in an external platform (Dropbox), since due to its format this data model does not fit well in an embedded image. These two instruments are going to be useful as references for some important points that I enumerate below in the section entitled Aspects to resolve in order to keep moving forward.
First, here is the…
Since it is only that, preliminary, consider it as an means helping us to acomplish the desired final data model.
Assumed business rules
Said preliminary data model was derived from a collection of business rules (inferred from your question) that I will enumerate as follows:
Organizations and profiles
Note that Profile
is currently understood as a synonym for Person
.
- An
Organization
is a friend of one-to-many Profiles
.
- An
Organization
is a friend of one-to-many Organizations
.
- An
Organization
is a member of one-to-many Organizations
.
- A
Profile
is a member of one-to-manyOrganizations
.
- A
Profile
is a friend of one-to-many Profiles
.
- A
Profile
is a member of one-to-many Profiles
.
Locations and addresses
An Organization
owns one-to-many Locations
.
A Location
is classified by one-to-many LocationTypes
(only one at a given point in time).
A Location
may have one-to-many Addresses
(one Physical
, one for Shipping
, one for Billing
, or one that serves all said purposes, or one that combines two purposes and another that serves only one of them).
An Address
may be kept by one-to-many Profiles
or, put another way, a Profile
keeps one-to-many Addresses
.
A specific Address
may be used by one-to-many Profiles
(serving as Physical
for one Profile
, being used for Billing
by a different one, etc.). So, an Address
works in a similar way for Locations
and Profiles
.
- Thus, an individual
Address
may be, at the same time, of type Physical
, Shipping
and Billing
.
Locations and roles
- A
Location
opens one-to-many Roles
.
- A
Role
may be carried out in one-to-many Locations
.
- A
Profile
(once it has been set as Member
of an Organization
) may carry out one-to-many Roles
, in one-to-many Locations
(but only one specific Role
in each Location
at a particular point in time, i.e., never two or more Roles
at the same time).
Aspects to resolve in order to keep moving forward
In order to keep advancing in the resolution of your data model, here is a list of relevant points that, once we work them out, are going to help us to reach this goal:
I have assumed that the term Profile
in your context has a similar (or the same) meaning as that of Person
, but it could be a bit different. In this way, would you say that, in your scenario, the entities Organization
and Person
are subtypes of Profile
?
Can a Profile
(or Person
) own one-to-many EmailAddresses
, or is a Profile
(or Person
) fixed to exactly one EmailAddress
?
Would you like to provide the possibility for an Organization
to be contacted via Telephone
and Email
, or you want to restrict that to be possible only for a Profile
(or Person
)?
I assume that a Location
is fixed to exactly one Address
of the type Physical
, is this correct?
Is it possible for a Location
to be shared by one-to-many different Organizations
or, otherwise, a Location
can be owned by only one Organization
?
You have stated via comments that the fact of being a Member
and a Friend
is the same. As you can see in my proposed preliminary data model, I followed you original specifications and depicted all the possible combinations of membership and friendship between Organization
and Profile
(or Person
) in different entities since I think that it can be helpful in the effort of defining the best possible structure for that part of your scenario. In this sense:
- I assume that the the statement
an Organization is a Member of another Organization
has different effects than the statement a Profile (or Person) is a Member of an Organization
regards the entity called Location
.
- As you can see in the data model, I think that the
Role
of Owner
is only valid for an Organization
and, to me, the valid Roles
for a Profile
(or Person
), inside a Location
are Admin
and Member
. What do you think about all this? Since you are in direct contact with the business rules that apply to your situation, you need to tell me if my assumptions are correct.
Can a Profile
(or Person
) play different Roles
inside the same Location
? i.e., can a Person
be, at the same time, the Admin
and also a Member
of the same Location
? What are the rules in this regard?
I think that the same Profile
(or Person
) can play different Roles
in different Locations
. For instance: A specific Profile
(or Person
) is the “Admin” in Location
“1”, and this same Profile
(or Person
) is a “Member
” in Location
“2”, at the same time. Am I right?
Is it possible for a particular Location
to have different LocationTypes
at the same time, or is an individual Location
fixed to hold exactly one LocationType
?
Does the attribute Organization.Website
represent the website address of a particular organization, such as “dba.stackexchange.com”?
If Profile
“1” (understood as Person
) is a Member
(or Friend
) of Profile
“2”, is it possible for Profile
“1” to carry out a Role
in a Location
owned by Profile
“2”? I consider that such scenarios are only valid for the relationships between an Organization
and a Member
Person
so, what do you think?
In a similar way, if Organization
“1” is a Member
(or Friend
) of Organization
“2”, is it possible for Organization
“1” to carry out a Role
in a Location
owned by Organization
“2”? Again, I think that this kind of scenarios are only valid for the relationships between an Organization
and a Member
Person
, is this correct?
In this regard —while I am writing this questions— I think that it would be reasonable to say that there are only three different kinds relationships involving Organizations
and Persons
, and we can define:
- (a) The relationship between an
Organization
and a Person
as “Membership
”.
- (b) The relationship between a
Person
and another different Person
as “Friendhip
”.
- (c) We have yet to find a meaningful name in order to describe the relationship between an individual
Organization
and another different Organization
.
- So, let me know what you think about (a), (b) and (c).
Is it possible for an Organization
to be a Friend
(or a Member
) of one-to-many different Organizations
at the same time? Or it is only possible for an Organization
to have only a relationship with exclusively one different Organization?
Successive data model depicting the first advance
In attention to your responses and resolutions to the pending aspects that I have listed above, I have created the following…
Although I do not feel quite comfortable with it yet, this new data model expresses the following business rules:
- A
Profile
is either an Organization
or a Person
.[2]
- A
Profile
may be the offering friend of one-to-many FriendProfiles
, and a Profile
may be the accepting friend of one-to-many FriendProfiles
.[3]
- A
Location
may consist of one-to-many Locations
.[4]
Answers to your subsequent specific comments
It's really interesting for me to note/compound the separation of concerns [e.g. LocationAddress and ProfileAddress] - for I obviously wanted to rush in and hold them all without the correct relations [funnily, It didn't feel right with my original ERD].
Yes, that is a good comparison, although I would not call it separation of concerns (which is, certainly, a fundamental principle in application programming and design), since this term commonly pertains to the application development stage and we currently find ourselves in the stage of understanding the data and designing its logical structure.
From my personal experience, I consider that this phase has to do with putting the significant things into their whole context, it has to do with seeing the associations that exist between the different entities that are of relevance in the particular scenario of interest, and then depicting these things in a data model. In the specific case on which you are commenting about, the Address
entity may have different kinds of connections with other entities, one with Profile
and a different one with Location
.
And, yes, when something does not feel right or natural, it may well be a sign that one needs to put more effort in order to understand the pertinent data. In this manner, the Address
entity is one of the things that I consider that needs more attention, since I think that the relationship between a Profile
and an Address
could be handled by means of the Location
entity (due to the fact that every Location
must have at least one physical Address
), therefore we could dismiss the ProfileAddress
associative entity depicted in the latest model, but you should continue analizing these points and let me know your ideas.
Also, is it common practice in IDEF1X to change PK/FK denotions in entities for better readability [e.g ProfileId - LocationOwnerProfileId]?
Yes, that is a very clever remark from you, since IDEF1X recommends the use of role names for denominating FOREIGN KEYS, in order to capture the meaning of such attributes in accordance to the entity in which it is being used. It is also worth noting that this is also strongly related to the concept of primary keys migration. As a matter of fact, the use of role names precedes IDEF1X, since it was originally presented by Dr. E. F. Codd in his 1970 seminal text. In this manner, one can clearly see the fidelity that the IDEF1X standard keeps towards the relational model.
I'd be intrigued to learn what you don't particularly like/feel it doesn't model, with/in the solution?
Besides the details already described above about the Address
entity, I am not sure if the Roles
carried out by a given Profile
in a particular Location
are equivalent for an Organization
or an Person
. From my perspective, a Person
first needs to be associated with an Organization
, and then this Organization
would appoint said Person
to perform a Role
in a particular Location
, but you know the scenario better, so this rules may be needless. In this regard, I am going to insist about the fact that it would be very helpful for me to know the contextual description or meaning that the future users of this data structure give to Organization
, Profile
, and Location
, but I understand that this may be considered confidential information, so this would be a limitation.
With the current structure, it seems like everyone (Organization
or Person
) can be related to anyone (again, Organization
or Person
) and can be/do anything (Role
) anywhere (Location
) but, perharps, this is preciselly what you and the users are expecting from this database, for which you will provide well defined constraints, of course. If this is the case, then we are almost providing a final solution. Since, naturally, your opinion is decisive in this situation, you should also analize this ideas and then let me know your conclusions so that we can take the final steps.
Feasible second advance
Unfortunately, the comunication stopped a few weeks ago, I guess because of work commitments that you must meet, which is completely reasonable. I would have been much more content if we had developed a more stable and robust model but, due to our previous interactions, I can assume that I have been able to point you in the right direction.
In addition to what has already been presented in this question and answer process, I consider that providing a new progression from the previous data models may be helpful for other seekers with a similar problem. So, I have created the…
Organizations and Profiles Preliminary Data Model — Second Advance
As can be seen in such data model, I have removed the many-to-many relationship that I have depicted in the preceding models between Profile
and Address
, since a given Profile
is already related to one-to-many Addresses
via its owned Locations
.
Another change that is illustrated in this new advance is the fact that it now includes the possibility that a given Location
can be owned by one-to-many Profiles
. Consequently, I have changed the Location
PRIMARY KEY (by dismissing the LocationOwnerProfileId
attribute) and then added an associative entity (many-to-many) that relates Profile
with Location
.
Notes
1. IDEF1X is a highly recommendable data modeling technique that was defined as a standard in december 1993 by the U.S. National Institute of Standards and Technology (NIST).
2. This is an ocurrence of a (super)type-subtype cluster. In case you are interested, here is an answer in which I deal in a more detailed manner with this kind of relationships.
3. An example of a many-to-many hierarchical relationship, and is very simliar to the structure that gave definitive solution to the “Parts Explotion Problem”. Such solution was, of course, introduced by Dr. Edgar Frank Codd in his 1970 enormously influential paper “A Relational Model of Data for Large Shared Data Banks”.
4. As such, this is an instance of a one-to-many (or many-to-one) hierarchical relationship.
Best Answer
"Can a subclass have 2 parents(2 entities that connect to it?)". Yes. However: Avoid redundant relationships in your ERDs. For your particular scenario: if the BOOKING entity is connected to STANDARD/EXECUTIVE via ROOM, then you will later be able to write a query whose output tells the user what type of room has been booked.
Regarding your questions: "For each entity connecting to this sublass, does it have its own instance of it? So: Will my room entity record if its a standard or executive room, and will the booking entity be able to record if that specific booking was to have decorations or not?" ANSWER: Yes, a ROOM can have attributes that determine whether it is a "standard" or "executive" (see the model below). A booking does not change a room's "decorations".![enter image description here](https://i.stack.imgur.com/izUMl.png)
It would be sufficient to model ROOM -< BOOKING (each room may have one or more bookings, same as you have done). As for the "room types", I would use an ARC ie an exclusive OR (XOR). The reasoning behind it: each room is either equipped as STANDARD room (with all the necessary attributes) or it has the EXECUTIVE "package" (including all attributes that you need).
There is NO need to draw (a) relationship(s) between STANDARD/EXECUTIVE and BOOKING. A ROOM gets booked (and each room is either a STANDARD or an EXECUTIVE, but this gets recorded in the respective attribute/column -> see the diagrams (ERD above, relational model below).
If it is possible that a single (unique) booking ID can be used to book several rooms, then you'd have a M:M relationship between ROOM and BOOKING. (Which would require an intersection.)
(As for the implementation of the XOR - which comes much much later: you can use a CHECK constraint in the ROOMS table in order to force the room to be either "standard" or "executive".)