You are thinking along the right track in wanting to store information about each person. Since you want to record a history of their statistics, the best way to do that is to use another table which will keep one row for each time your record statistics with the date when they were recorded. Now the question becomes how do you want to record each statistic? You have two basic options.
Column Wise
If you have a fairly fixed set and want to make make it very simple, you can place each statistic by name as a column on Measurement along with the Date. So for example, you'd have a column for Weight, Steps Walked, and so on. The advantage of this approach is it is simple and clear, and you can use a data type specific to each type of statistic. The disadvantage to this approach is that to add new kinds of statistics you have to add new columns, and you have to make all of the statistic types optional should you not record one or more of them for a given Measurement.
Row Wise
A second option would be to make the statistic more generalized by having a Statistic Type table. There would be a row in Statistic Type for each kind of statistic you want to record. Then, you would associate Measurement to the Statistic Type and record the Value. The advantage of this approach is that you can easily add more statistics, and you only have to record just the values for the statistic types you measured. The disadvantage is that this is more abstract and complex, and you have to use a generalized data type that can support all of the various units of measure, or you have to create a mutually exclusive set of columns each of a data type matching a statistic type. If you go with the first approach, you can really turn it into a science project in order to support all the various data types and still ensure data integrity.
Other Points
The diagrams I show are created using Oracle's Data Modeler tool which is a free download and very powerful. You can create DDL directly from it. One thing you want to consider that I didn't discuss is that, if you use surrogate keys as I did here, you want to also define an alternate unique key for each table using more natural columns. In the case of Person that could be Name. This way, you won't accidentally add the same person twice if your list gets long. A second thing is you have to think about if you have one value per statistic or more than one. For example, Weight and Steps are good examples of a single statistic for the Measurement. Today you record that I weight 200 pounds and walked 5000 steps. But for something like Bad Foods Eaten there could be more than one. If that is the case you have what we call a repeating group. To resolve this, you either have to create a fixed number of columns - one for each bad food - or create a new table that will be a child of the Measurement that will have one row for each bad food.
Examples
Here is an example of the way your data would look using the column approach (skipping the surrogate keys for simplicity of display):
And here is an example of the way your data would look using the row approach:
I hope this helps! A great book on data modeling like this is Steve Hoberman's Data Modeling Made Simple.
IMHO you just have to pay attention on the order of cardinalities, and you should note that Issue Date is not a weak key, but it's not a key at all: in fact you could have two account statements of different people issued on the same date, so you need a foreign key (Issue Date plus Number of Account)
Best Answer
The part of the scenario that you are confused with can be modeled with a classic construct called supertype-subtype1 structure.
I will (1) introduce some pertinent preliminary ideas, (2) detail how I would delineate —at the conceptual level— the business context under consideration, and (3) provide additional related material —e.g., the corresponding logical-level representation via SQL-DDL declarations— as follows.
Introduction
A structure of this nature takes place when, in a given business environment, there is a cluster of entity types within which the supertype has one or more properties (or attributes) that are shared by the rest of the entity types in the cluster, i.e., the subtypes. Every subtype has, in turn, a particular set of properties that are applicable to itself only.
Supertype-subtype clusters can be of two kinds:
Exclusive. Comes about when an instance of the superentity type must always have one and only one subtype counterpart; therefore, the potential subtype occurrences in question are mutually exclusive. This is the kind that concerns to your scenario.
A typical case in which an exclusive supertype-subtype comes about is a business domain where both an Organization and a Person are considered Legal Parties, like in the situation deliberated in this series of posts.
Nonexclusive. Presents itself when a supertype instance may be complemented by multiple subtype occurrences, each of which is compelled to be of a different category.
An example of this kind of supertype-subtype is dealt with in these posts.
Notes: It is worth mentioning that supertype-subtype structures —being elements of a conceptual character— do not belong to a specific data management theoretical framework, be it relational, network or hierarchical —each of which offers particular structures to represent conceptual elements—.
It is also opportune to point out that although supertype-subtype clusters bear a certain resemblance to object-oriented application programming (OOP) inheritance and polymorphism, they are in fact distinct devices because they serve different purposes. In a database conceptual model —that must represent real world aspects— one deals with structural features in order to describe informational requirements, whereas in OOP polymorphism and inheritance, among other things, one (a) sketches and (b) implements computational and behavioural characteristics, aspects that decidedly belong to appplication program design and programming.
Apart from that, an individual OOP class —being an application program component—, does not necessarily have to “mirror” the structure of an individual entity type that belongs to the conceptual level of the database at hand. In this respect, an application programmer may typically create, e.g., one single class that “combines” all the properties of two (or more) different conceptual-level entity types, and such a class may as well include computed properties.
Using entity-relationship constructs to represent a conceptual model with supertype-subtype structures
You asked for an entity-relationship diagram (ERD for brevity) but, although being an extraordinary modeling platform, the original method —as introduced by Dr. Peter Pin-Shan Chen2— did not supply enough constructs to represent scenarios of the sort being discussed with the precision that a proper database conceptual model requires.
Consequently, it was necessary to make some extensions to said method, situation that yielded results in the development of an approach that assists in the creation of enhanced entity-relationship diagrams (EERDs) that, naturally, enriched the initial diagramming technique with new expressive characteristics. One of those characteristics is, precisely, the possibility of depicting supertype-subtype structures.
Modeling your context of interest
The illustration shown in Figure 1 is an EERD (using symbols similar to the ones proposed by Ramez A. Elmasri and Shamkant B. Navathe3, who refer to such structures as superclass/subclass) where I modeled the business domain you describe considering all the specifications. It is also available as a PDF that can be downloaded from Dropbox.
As you can see in the aforementioned diagram, both
Group
andSoloPerformer
are displayed as exclusive subtypes of theArtist
superentity type:Diagram description
In order to start the description of the EERD, it is important to point out that your sentence
is related to the disjointness and the completeness aspects of the supertype-subtype cluster at hand.
Disjointness
The disjointness feature is particularly important because it is right here where the “or part” that you mentioned comes into play, due to the fact that an
Artist
has to be either one subtype instance or the other, which I specified in the EERD through the small circle containing the letter “d”, a construct that receives the name of disjoint rule.When a supertype may be supplemented by one or more of its possible subtypes, this point must be expressed by a small circle holding a label with the letter “o”, a symbol called overlap rule.
Discriminator property
Also within the scope of the disjointness factor of this supertype-subtype association, it is worth paying close attention to the
Artist.Type
property, since it carries out a very relevant task in this arrangement: it functions as the subtype discriminator. It is named in this way as it is the property that points out the exclusive kind of subtype with which a specific instance of anArtist
relates to.In the cases of nonexclusive clusters, the use of a discriminator property is needless, for a certain supertype can have multiple subtypes as complements (as brought up above).
Total specialization rule and completness
The requirement that stipulates that every
Artist
must always have a supplementary subtype instance has to do with the completness characteristic of this cluster. This is delineated by means of a total specialization rule, demonstrated via the double-line symbol connecting (a) theArtist
supertype with (b) the disjoint rule construct.Relating Groups with Solo Performers
Evaluating the sentences
and
one can recognize that both subtypes are involved in a many-to-many (M:N) association (or relationship), which I represented with the diamond-shaped box labeled as
Group-SoloPerformer
.If implemented in a relational database as a base table, this component would be very useful to derive (i.e., to carry out the calculation of) the total
Number
ofSoloPerformers
that make up a concreteGroup
(one of the requirements that you specified).The association between Solo Performers and Instruments
The stipulation
permits us inferring that, at the same time,
Thus, this is another example of a M:N association, and I used the diamond-shaped figure designated
SoloPerformer-Instrument
to expose it.Additional material
In order to expound the scope of supertype-subtype structures I am going to include two more resources, i.e.,
an IDEF1X4 diagram presented in Figure 2 (and you can download it from Dropbox as a PDF, as well) that illustrates the expressive capabilities of this sort of diagrams regarding the business domain at issue; and
the respective expository DDL logical structure that exemplifies how to manage the full scenario under discussion by virtue of a SQL database management system.
1. IDEF1X representation
The IDEF1X information modeling technique certainly offers the capability of portraying supertype-subtype structures, though with a limitation: it does not lend a visual mechanism to indicate whether a precise cluster is of an exclusive or noneclusive kind (its “native” symbols can only communicate the complete or incomplete identification of all the possible subentity types of significance). Fortunately, one can employ Information Engineering (IE) notation to show this paramount aspect more accurately while taking advantage of the descriptive power of the IDEF1X standard.
In this technique, the main feature of your question is denominated “categorization relationship”, where a supertype is referred to as “generic entity” and a subtype receives the name of “category entity”. However, I will continue applying the term supertype-subtype in this post because (1) it was used by Dr. Edgar Frank Codd, the originator of the relational model, (2) it is more widely known and (3) IE notation is used instead of the “native” one.
Foreign keys and supertype-subtype clusters
As demonstrated, IDEF1X provides a further advantage: the means to exhibit FOREIGN KEY (FK) definitions, elements of prime importance if a practitioner is going to represent a supertype-subtype association in a relational database.
In order to portray such a sort of association, the PRIMARY KEY (PK) property of the supertype, i.e.
Artist.ArtistNumber
, has to migrate toGroup
andSoloPerformer
, although it has been assigned two different role names5, 6,GroupNumber
andSoloPerformerNumber
respectively, for the purpose of emphasizing the meaning conveyed by the property in the context of each subentity type.Apart from being characterized as PKs, the
Group.GroupNumber
andSoloPerformer.SoloPerformerNumber
properties are, at the same time, depicted as FOREIGN KEYs (FKs) that make a reference toArtist.ArtistNumber
, the supertype PK property.So, since every
SoloPerformer
andGroup
occurrence is existence-dependent on an exactArtist
instance, these entity types are involved in an identifying association that takes effect by way of the PK property migration process delineated in the preceding paragraphs.Foreign keys and associative entity types
The IDEF1X diagram serves as well to illustrate the FKs that compose the PKs of the two associative entity types of relevance, i.e.,
GroupMember
andSoloPerformerInstrument
; the first one connects the two subtypes, and the second one links a subtype with an independent entity type, i.e.,Instrument
.2. Expository SQL-DDL logical declarations
As explained before, a supertype-subtype structure is a means to express certain kinds of business-domain-specific conceptualizations regarding informational requirements, which can in turn be represented in a database by means of distinct constructs that must correspond to those offered by the particular theoretical paradigm (be it relational, network or hierarchical) followed by the database management system being utilized by the designer.
One of the multiple advantages of the relational paradigm is that it permits representing the information in its natural structure, and the most popular approximations to the systems proposed in the relational theory are the various SQL database management systems.
So, finally, here are some sample DDL statements —including (a) base tables schemas along with (b) some of the pertinent constraints— that represent, at the logical level of abstraction, the conceptual modeling exercise treated above:
Data integrity and consistency considerations
In agreement with all that has been previously explained, the designer must guarantee that each “supertype” row is at all times complemented by its accompanying “subtype” counterpart and, in turn, make sure that said “subtype” row is compatible with the value contained in the supertype “discriminator” column.
It would be very practical and elegant to enforce said circumstances declaratively (as the relational framework proposes) but, alas, none of the major SQL platforms has provided the suitable mechanisms to do so (as far as I know). Therefore, it is highly convenient to employ ACID TRANSACTIONS so that these conditions are always met in a database (other option would be to make use of TRIGGERS, but they tend to make things untidy).
Data derivation considerations
One of the main aspects of the relational model is that it considers data derivation as a paramount factor in data management. In accordance, it facilitates creating (a) base relations —or base tables in SQL, as shown in the DDL statements above— and (b) derived relations —derived tables in SQL, i.e., those declared by dint of SELECT operations that may be fixed as views for further exploitation—.
So, one can declare a view that gathers the “full” Group data points:
And other view that combines the “full” SoloPerformer pieces of information:
In this manner, it is very easy to manipulate —declaratively— all the significant data via the very same logical-level device, i.e., the relation or table (be it base or derived). Evidently, the usage of views is more effective when the conceptual entity types represented in a relational database possess more properties of interest, but it is a possibility worth illustrating with the present scenario.
References
1 Codd, E. F. (Dec. 1979). Extending the Database Relational Model to Capture More Meaning, ACM Transactions on Database Systems, Volume 4 Issue 4 (pp. 397-434). New York, NY, USA.
2 Chen, P. P. (March 1976). The Entity-Relationship Model—Toward a Unified View of Data, ACM Transactions on Database Systems - Special issue: Papers from the International Conference on Very Large Data Bases: September 22–24, 1975, Framingham, MA, Volume 1 Issue 1 (pp. 9-36). New York, NY, USA.
3 Elmasri, R & Navathe, S. B. (2003). Fundamentals of Database Systems, Fourth Edition. Addison-Wesley Longman Publishing Co., Inc. Boston, MA, USA.
4 National Institute of Standards and Technology (U.S.) [NIST] (Dec. 1993). Integration Definition for Information Modeling (IDEF1X), Federal Information Processing Standards Publication, Volume 184. USA.
5 Codd, E. F. (June 1970). A Relational Model of Data for Large Shared Data Banks, Communications of the ACM, Volume 13 Issue 6 (pp. 377-387). New York, NY, USA.
6 See reference 4