I am trying to design a database relational model for an assignment. First, I design the ER diagram, and then the relational model with tables (schemas).
Here's my question: Is it possible for an entity to have only one attribute?
database-agnosticdatabase-designerdrelational-theory
I am trying to design a database relational model for an assignment. First, I design the ER diagram, and then the relational model with tables (schemas).
Here's my question: Is it possible for an entity to have only one attribute?
What you can do is not always appropriate for the various usages of your relational model. If you were creating a data warehouse in order to analyze customer sales then derived attributes would be appropriate. I have done this for a summary table for a reporting tool. The query would have joined upwards of thirty tables with many aggregations such as sums and show all values an entity has had. A summary table, refreshed daily, listed derived attributes was a great solution for reporting.
For an online transaction processing database using derived attributes is not always the best solution.
For example: now your total is price * quantity
Next month management decides to implement a discount of 10% for customers who order more than $1000 in a calendar year. Your total column now looks inflexible.
Unfortunately the things you can say today that "will never change" such as Total = price * quantity are really an example of business logic. Business logic can change anytime in unexpected ways.
To continue with your example....if management institutes a discount and you have a customers table, orders table then all you have to do is add a discount table. Then you can create a view which encapsulates the business logic of the day to derive total sales. When the logic changes you can change the view much easier than recalculating the derived attributes that are fixed in a table.
And if you really want to be prepared you could store the changes in the business logic in a table in the database and cover off "Who, What, why". So if the Bob the Manager offers a discount and five years later Sue, the new manager, says "When did we start offering discounts and who authorized it?" you are a database star.
The proper structure for this scenario is a SubClass / Inheritance model, and is nearly identical to the concept I proposed in this answer: Heterogeneous ordered list of value.
The model proposed in this question is actually quite close in that the Animal
entity contains the type (i.e. race
) and the properties that are common across all types. However, there are two minor changes that are needed:
Remove the Cat_ID and Dog_ID fields from their respective entities:
The key concept here is that everything is an Animal
, regardless of race
: Cat
, Dog
, Elephant
, and so on. Given that starting point, any particular race
of Animal
doesn't truly need a separate identifier since:
Animal_ID
is uniqueCat
, Dog
, and any additional race
entities added in the future do not, by themselves, fully represent any particular Animal
; they only have meaning when used in combination with the information contained in the parent entity, Animal
.Hence, the Animal_ID
property in the Cat
, Dog
, etc entities is both the PK and the FK back to the Animal
entity.
Differentiate between types of breed
:
Just because two properties share the same name does not necessarily mean that those properties are the same, even if the name being the same implies such a relationship. In this case, what you really have is actually CatBreed
and DogBreed
as seperate "types"
VARCHAR
but if you need to store anything outside of the standard ASCII set, you should really use NVARCHAR
.Race
, CatBreed
, and DogBreed
) are not auto-incrementing (i.e. IDENTITY in terms of T-SQL) because they are application constants (i.e. they are part of the application) that are static lookup values in the database and are represented as enum
s in C# (or other languages). If values are added, they are added in controlled situations. I reserve the use of auto-increment fields for user data that comes in via the application.
This first set of tables are the lookup / types tables:
CREATE TABLE Race
(
RaceID INT NOT NULL PRIMARY KEY
RaceName VARCHAR(50) NOT NULL
);
CREATE TABLE CatBreed
(
CatBreedID INT NOT NULL PRIMARY KEY,
BreedName VARCHAR(50),
CatBreedAttribute1 INT,
CatBreedAttribute2 VARCHAR(10)
-- other "CatBreed"-specific properties as needed
);
CREATE TABLE DogBreed
(
DogBreedID INT NOT NULL PRIMARY KEY,
BreedName VARCHAR(50),
DogBreedAttribute1 TINYINT
-- other "DogBreed"-specific properties as needed
);
This second listing is the main "Animal" entity:
CREATE TABLE Animal
(
AnimalID INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
RaceID INT NOT NULL, -- FK to Race
Name VARCHAR(50)
-- other "Animal" properties that are shared across "Race" types
);
ALTER TABLE Animal
ADD CONSTRAINT [FK_Animal_Race]
FOREIGN KEY (RaceID)
REFERENCES Race (RaceID);
This third set of tables are the complimentary sub-class entities that complete the definition of each Race
of Animal
:
CREATE TABLE AnimalCat
(
AnimalID INT NOT NULL PRIMARY KEY, -- FK to Animal
CatBreedID INT NOT NULL, -- FK to CatBreed
HairColor VARCHAR(50) NOT NULL
-- other "Cat"-specific properties as needed
);
ALTER TABLE AnimalCat
ADD CONSTRAINT [FK_AnimalCat_CatBreed]
FOREIGN KEY (CatBreedID)
REFERENCES CatBreed (CatBreedID);
ALTER TABLE AnimalCat
ADD CONSTRAINT [FK_AnimalCat_Animal]
FOREIGN KEY (AnimalID)
REFERENCES Animal (AnimalID);
CREATE TABLE AnimalDog
(
AnimalID INT NOT NULL PRIMARY KEY, -- FK to Animal
DogBreedID INT NOT NULL, -- FK to DogBreed
HairColor VARCHAR(50) NOT NULL
-- other "Dog"-specific properties as needed
);
ALTER TABLE AnimalDog
ADD CONSTRAINT [FK_AnimalDog_DogBreed]
FOREIGN KEY (DogBreedID)
REFERENCES DogBreed (DogBreedID);
ALTER TABLE AnimalDog
ADD CONSTRAINT [FK_AnimalDog_Animal]
FOREIGN KEY (AnimalID)
REFERENCES Animal (AnimalID);
The model using a shared breed
type is shown after the "Additional Notes" section.
breed
seems to be a focal point for confusion. It was suggested by jcolebrand (in a comment on the question) that breed
is a property shared across the different race
s, and the other two answers have it integrated as such in their models. This is a mistake, however, because the values for breed
are not shared across the different values of race
. Yes, I am aware that the two other proposed models attempt to solve this issue by making race
a parent of breed
. While that technically solves the relationship issue, it doesn't help solve the overall modeling question of what to do about non-common properties, nor how to handle a race
that does not have a breed
. But, in the case that such a property were guaranteed to exist across all Animal
s, I will include an option for that as well (below).Animal
), orrace
s be stored in the Animal
entity which is a very flat (and nearly non-relational) way of representing this data. Yes, people do this all of the time, but it means having many NULL fields per row for the properties that are not meant for that particular race
AND knowing which fields per row are associated with the particular race
of that record.race
of Animal
in the future that does not have breed
as a property. And even if ALL Animal
s have a breed
, that wouldn't change the structure due to what has been previously noted about breed
: that breed
is dependent on the race
(i.e. breed
for Cat
is not the same thing as breed
for Dog
).
Please note:
The SQL below can be run in the same database as the model presented above:
Race
table is the sameBreed
table is newAnimal
tables have been appended with a 2
Breed
being a now common property, it does not seem right not to have Race
noted in the main/parent entity (even if it is technically relationally correct). So, both RaceID
and BreedID
are represented in Animal2
. In order to prevent a mismatch between the RaceID
noted in Animal2
and a BreedID
that is for a different RaceID
, I have added a FK on both RaceID, BreedID
that references a UNIQUE CONSTRAINT of those fields in the Breed
table. I usually despise pointing a FK to a UNIQUE CONSTRAINT, but here is one of the few valid reasons to do so. A UNIQUE CONSTRAINT is logically an "Alternate Key", which makes it valid for this use. Please also note that the Breed
table still has a PK on just BreedID
.
BreedID
to be repeated across different values of RaceID
.BreedID
, so it should still be possible to reference a specific value of Breed
without having the RaceID
available.Breed
(and are why I prefer the Race
-specific Breed
tables).
Breed
have the same properties. There is no easy way in this model to have disparate properties between Dog
"breeds" and Elephant
"breeds". However, there still is a way to do this, which is noted in the "Final Edit" section.Breed
across more than one race. I am not sure if that is desirable to do (or maybe not in the concept of animals but possibly in other situations that would be using this type of model), but it is not possible here.CREATE TABLE Race
(
RaceID INT NOT NULL PRIMARY KEY,
RaceName VARCHAR(50) NOT NULL
);
CREATE TABLE Breed
(
BreedID INT NOT NULL PRIMARY KEY,
RaceID INT NOT NULL, -- FK to Race
BreedName VARCHAR(50)
);
ALTER TABLE Breed
ADD CONSTRAINT [UQ_Breed]
UNIQUE (RaceID, BreedID);
ALTER TABLE Breed
ADD CONSTRAINT [FK_Breed_Race]
FOREIGN KEY (RaceID)
REFERENCES Race (RaceID);
CREATE TABLE Animal2
(
AnimalID INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
RaceID INT NOT NULL, -- FK to Race, FK to Breed
BreedID INT NOT NULL, -- FK to Breed
Name VARCHAR(50)
-- other properties common to all "Animal" types
);
ALTER TABLE Animal2
ADD CONSTRAINT [FK_Animal2_Race]
FOREIGN KEY (RaceID)
REFERENCES Race (RaceID);
-- This FK points to the UNIQUE CONSTRAINT on Breed, _not_ to the PK!
ALTER TABLE Animal2
ADD CONSTRAINT [FK_Animal2_Breed]
FOREIGN KEY (RaceID, BreedID)
REFERENCES Breed (RaceID, BreedID);
CREATE TABLE AnimalCat2
(
AnimalID INT NOT NULL PRIMARY KEY, -- FK to Animal
HairColor VARCHAR(50) NOT NULL
);
ALTER TABLE AnimalCat2
ADD CONSTRAINT [FK_AnimalCat2_Animal2]
FOREIGN KEY (AnimalID)
REFERENCES Animal2 (AnimalID);
CREATE TABLE AnimalDog2
(
AnimalID INT NOT NULL PRIMARY KEY,
HairColor VARCHAR(50) NOT NULL
);
ALTER TABLE AnimalDog2
ADD CONSTRAINT [FK_AnimalDog2_Animal2]
FOREIGN KEY (AnimalID)
REFERENCES Animal2 (AnimalID);
Breed
, it is possible to employ the same subclass / inheritance concept but with Breed
as the main entity. In this setup the Breed
table would have the properties common to all types of Breed
(just like the Animal
table) and RaceID
would represent the type of Breed
(same as it does in the Animal
table). Then you would have subclass tables such as BreedCat
, BreedDog
, and so on. For smaller projects this might be considered "over-engineering", but it is being mentioned as an option for situations that would benefit from it.For both approaches, it sometimes helps to create Views as short-cuts to the full entities. For example, consider:
CREATE VIEW Cats AS
SELECT an.AnimalID,
an.RaceID,
an.Name,
-- other "Animal" properties that are shared across "Race" types
cat.CatBreedID,
cat.HairColor
-- other "Cat"-specific properties as needed
FROM Animal an
INNER JOIN AnimalCat cat
ON cat.AnimalID = an.AnimalID
-- maybe add in JOIN(s) and field(s) for "Race" and/or "Breed"
CreatedDate
field would be added to the Animal
table. This field is not needed in any of the subclass tables (e.g. AnimalCat
) as the rows being inserted for both tables should be done at the same time within a transaction.LastModifiedDate
field would be added to the Animal
table and all subclass tables. This field gets updated only if that particular table is updated: if an update occurs in AnimalCat
but not in Animal
for a particular AnimalID
, then only the LastModifiedDate
field in AnimalCat
would be set.
Best Answer
Yes, but.
If an entity only has one attribute, then that attribute must serve as an identifier for the entity. Typically, if only one attribute is found for an entity in the given data stream, then another attribute is "invented" for purposes of identifying. It's often called "id", it's often an integer, and it's often populated using the autonumber feature of the database system.
When you are doing analysis, it's important to distinguish between the features that were given to you in the problem statement, and the features you added in order to make sense out of the problem.