I’m having some trouble in recreating a database with a one-to-many (1:M) relationship between Users and Items.
This is pretty straightforward, yes; however, each Item belongs to a certain Category (e.g., a Car, a Boat or a Plane), and each Category has a particular number of attributes, e.g.:
Car
structure:
+----+--------------+--------------+
| PK | Attribute #1 | Attribute #2 |
+----+--------------+--------------+
Boat
structure:
+----+--------------+--------------+--------------+
| PK | Attribute #1 | Attribute #2 | Attribute #3 |
+----+--------------+--------------+--------------+
Plane
structure:
+----+--------------+--------------+--------------+--------------+
| PK | Attribute #1 | Attribute #2 | Attribute #3 | Attribute #4 |
+----+--------------+--------------+--------------+--------------+
Because of this diversity in the number of attributes (columns), I initially thought it would be a good idea to create one separate table for each Category, so I would avoid several NULLs and thus making a better use of the indexing.
Although it looked great at first, I couldn’t find a way to create the relationship between the Items and the Categories through the database because, at least in my modest experience as a database administrator, when creating Foreign Keys, I inform explicitly a database the table name and column.
In the end, I would like a solid structure to store all data, while having all the means to list all attributes of all Items a User may have with one query.
I could hardcode dynamic queries with the server-side language, but I feel this is wrong and not very optimal.
Additional information
These are my responses to MDCCL comments:
1. How many Item Categories of interest are there in your business context, three (i.e., Cars, Boats and Planes) or more?
In fact, it’s very simple: There are only five Categories in total.
2. Will the same Item always belong to the same User (that is, once a given Item has been “assigned” to a certain User it cannot be changed)?
No, they could change. In the fictional scenario of the question, it would be like User A sells Item #1 for User B, so the ownership must be reflected.
3. Are there attributes that are shared by some or all of the Categories?
Not shared but, from memory, I can tell that at least three attributes are present in all Categories.
4. Is there a chance that the cardinality of the relationship between User and Item is many-to-many (M:N) instead of one-to-many (1:M)? For example, in the case of following business rules:
A User owns zero-one-or-many Items
andAn Item is owned by one-to-many Users
No, because Items would describe a physical object. Users will have a virtual copy of them, each one identified by a unique GUID v4
5. Regarding your following response to one of the question comments:
“In the fictional scenario of the question, it would be like User A sells Item #1 for User B, so the ownership must be reflected.”
It seems that you are planning to keep track of the item ownership evolution, so to speak. In this way, which attributes would you like to store about such phenomenon? Only the modification of the attribute that indicates the specific User who is the Owner of a specific Item?
No, not really. The ownership may change, but I don’t need to keep track of the previous Owner.
Best Answer
According to your description of the business environment under consideration, there exists a supertype-subtype structure that encompasses Item —the supertype— and each of its Categories, i.e., Car, Boat and Plane (along with two more that were not made known) —the subtypes—.
I will detail below the method I would employ to manage said scenario.
Business rules
In order to start delineating the relevant conceptual schema, some of the most important business rules determined so far (restricting the analysis to the three disclosed Categories only, to keep things as brief as possible) can be formulated as follows:
Illustrative IDEF1X diagram
Figure 1 displays an IDEF1X1 diagram that I created to group the previous formulations along with other business rules that appear pertinent:
Supertype
On the one hand, Item, the supertype, presents the properties† or attributes that are common to all the Categories, i.e.,
Subtypes
On the other hand, the properties‡ that pertain to every particular Category, i.e.,
are shown in the corresponding subtype box.
Identifiers
Then, the Item.ItemId PRIMARY KEY (PK) has migrated3 to the subtypes with different role names, i.e.,
Mutually exclusive associations
As depicted, there is an association or relationship with a one-to-one (1:1) cardinality ratio between (a) each supertype occurrence and (b) its complementary subtype instance.
The exclusive subtype symbol portrays the fact that the subtypes are mutually exclusive, i.e., a concrete Item occurrence can be supplemented by a single subtype instance only: either one Car, or one Plane, or one Boat (never by zero or less, nor by two or more).
†, ‡ I employed classic placeholder names to entitle some of the entity type properties, as their actual denominations were not supplied in the question.
Expository logical-level layout
Consequently, in order to discuss an expository logical design, I derived the following SQL-DDL statements based on the IDEF1X diagram displayed and described above:
This has been tested in this db<>fiddle running on MySQL 8.0.
As demonstrated, the superentity type and each of the subentity types are represented by the corresponding base table.
The columns
CarId
,BoatId
andPlaneId
, constrained as the PKs of the appropriate tables, help in representing the conceptual-level one-to-one association by way of FK constraints§ that point to theItemId
column, which is constrained as the PK of theItem
table. This signifies that, in an actual “pair”, both the supertype and the subtype rows are identified by the same PK value; thus, it is more than opportune to mention that§ In order to prevent problems and errors concerning (particularly FOREIGN) KEY constraint definitions —situation you referred to in comments—, it is very important to take into account the existence-dependency that takes place among the different tables at hand, as exemplified in the declaration order of the tables in the expository DDL structure, which I supplied in this db<>fiddle too.
‖ E.g., appending an additional column with the AUTO_INCREMENT property to a table of a database built on MySQL.
Integrity and consistency considerations
It is critical to point out that, in your business environment, you have to (1) ensure that each “supertype” row is at all times complemented by its corresponding “subtype” counterpart, and, in turn, (2) guarantee that said “subtype” row is compatible with the value contained in the “discriminator” column of the “supertype” row.
It would be very elegant to enforce such circumstances in a declarative manner but, unfortunately, none of the major SQL platforms has provided the proper mechanisms to do so, as far as I know. Therefore, resorting to procedural code within ACID TRANSACTIONS it is quite convenient so that these conditions are always met in your database. Other option would be employing TRIGGERS, but they tend to make things untidy, so to speak.
Declaring useful views
Having a logical design like the one explained above, it would be very practical to create one or more views, i.e., derived tables that comprise columns that belong to two or more of the relevant base tables. In this way, you can, e.g., SELECT directly FROM those views without having to write all the JOINs every time you have to retrieve “combined” information.
Sample data
In this respect, let us say that the base tables are “populated” with the sample data shown below:
Then, an advantageous view is one that gathers columns from
Item
,Car
andUserProfile
:Naturally, a similar approach can be followed so that you can as well SELECT the “full”
Boat
andPlane
information straight FROM one single table (a derived one, in these cases).After that —if you do not mind about the presence of NULL marks in result sets— with the following VIEW definition, you can, e.g., “collect” columns from the tables
Item
,Car
,Boat
,Plane
andUserProfile
:The code of the views here shown is only illustrative. Of course, doing some testing exercises and modifications might help to accelerate the (physical) execution of the queries at hand. In addition, you might need to remove or add columns to said views as the business needs dictate.
The sample data and all the view definitions are incorporated into this db<>fiddle so that they can be observed “in action”.
Data manipulation: Application program(s) code and column aliases
The usage of application program(s) code (if that is what you mean by “server-side specific code”) and column aliases are other significant points that you brought up in the next comments:
It is opportune to indicate that while using application program code is a very fitting resource to handle the presentation or graphical features —i.e., the external level of representation of a computerized information system— of data sets, it is paramount that you avoid carrying out data retrieval on a row-by-row basis to prevent execution speed issues. The objective should be to “fetch” the pertinent data sets in toto by means of the robust data manipulation instruments provided by the (precisely) set engine of the SQL platform so that you can optimize the behaviour of your system.
Furthermore, utilizing aliases to rename one or more columns within a certain scope may appear stressing but, personally, I see such resource as a very powerful tool that helps to (i) contextualize and (ii) disambiguate the meaning and intention ascribed to the concerning columns; hence, this is an aspect that should be thoroughly pondered with respect to the manipulation of the data of interest.
Similar scenarios
You might as well find of help this series of posts and this group of posts which contain my take on two other cases that include supertype-subtype associations with mutually exclusive subtypes.
I have also proposed a solution for a business environment involving a supertype-subtype cluster where the subtypes are not mutually exclusive in this (newer) answer.
Endnotes
1 Integration Definition for Information Modeling (IDEF1X) is a highly recommendable data modeling technique that was established as a standard in December 1993 by the U.S. National Institute of Standards and Technology (NIST). It is solidly based on (a) some of the theoretical works authored by the sole originator of the relational model, i.e., Dr. E. F. Codd; on (b) the entity-relationship view, developed by Dr. P. P. Chen; and also on (c) the Logical Database Design Technique, created by Robert G. Brown.
2 In IDEF1X, a role name is a distinctive label assigned to a FK property (or attribute) in order to express the meaning that it holds within the scope of its respective entity type.
3 The IDEF1X standard defines key migration as “The modeling process of placing the primary key of a parent or generic entity in its child or category entity as a foreign key”.