I'm running for office and have created a web app for tracking my door knocks to voters at their homes. The database contains a table called voters
that contains all the necessary information about voters in my community. I'm using mysql.
I'd like to add a new feature to track donors to my campaign. Not all of these donors live in the community and do not vote in the district where I'm running for office. I do not need to track the same kind of info for these individuals as I do for the voters so I'm going to place these individuals into a table called nonvoters.
Now, the individuals in my voters
table can also make donations, and I want to track those as well.
Finally, I'd also like to track donations from organizations, so I also need an organizations
table for storing information about the organizations.
To track the donations from both voters, nonvoters and organizations, I want to set up a new table called donations.
This table would contain the appropriate details about each donation.
But I'm uncertain as to what the best structure is for linking the donations
table to the voters
, nonvoters
, and organizations
table should be. If I create a column called donor_id
in the table to key it to the donors information, there's no way to know which table that ID refers to. So do I set up three columns, nonvoter_id
, voter_id
and org_id
and insert the ID into the applicable column depending on whether the donor is a voter? This seems weird.
Or maybe I create a new column in each of the three tables called donor_id
that I can use to link my data to the donations table. If I went this route, it seems like I'd have to do a some behind the scenes work to ensure the donor_id was unique and was keyed to the data inside the donations table.
Or maybe there are other approaches I'm not familiar with. Any guidance is appreciated.
Best Answer
Good question, you have provided a quite engaging context and analysis.
According to my understanding of your scenario description, some important aspects can be handled by means of (Super)Type-Subtype relationships. In this regard, this post provides relevant information.
Assumed business rules
As per your current specifications, the following assertions appear valid:
Party
[1] is either aPerson
or anOrganization
.Party
grants zero-one-or-manyDonations
.Person
can be aVoter
and aNonVoter
.Preliminary logical data model
Therefore, I have derived the IDEF1X[2] logical data model presented in Figure 1.
Party and Donation
As you can see, the incorporation of the
Party
entity as an exclusive supertype provides the possibility for aPerson
or anOrganization
to be able to grantDonations
, since thePartyId
attribute migrates[3] fromParty
toDonation
, but I have assigned a role name[4] (i.e.,DonorId
) to such attribute in order to make it more meaningful in the context of theDonation
entity.Person and Organization
In the same way,
PartyId
migrates toPerson
asPersonId
, and toOrganization
asOrganizationId
, and these two entities, in turn, can hold their specific attributes.Voter and NonVoter
I consider that, in your business context, a given
Person
can be aVoter
and aNonVoter
over time, depending on the particularDistrict
in which saidPerson
is located, thus I have temporarily depicted this circumstances in a nonexclusive supertype cluster, but I consider that we need to clarify some points in this respect.Aspects that require clarification
The goal of this data modeling excersise is to capture the things of interest with respect to your business context requirements, and one can serve that purpose by determining the entities involved along with their respective attributes (including, with especial attention, the ones that uniquely identify each entity ocurrence) and the way in which such entities are interconected. In this manner, the semantic value that said aspects hold for you, and your data users, is paramount. The better these factors are comprehended, the better they can be depicted in a data model.
Having said that, I have created three more models (also preliminary) in order to have a vehicule to exemplify some ideas about your scenario. They are contained in this PDF, which you can download from Dropbox.
Then, here are the concrete aspects that need clarification:
DoorKnock
?Person
may receive the status ofVoter
orNonVoter
in the context of —as mentioned above— (a) theDistrict
in which suchPerson
resides and (b) a determinedElection
instance. What do you think about it?Person
?Voter
?NonVoter
?Person
entity.Donation
can be granted only in the context of anElection
occurrence, is this correct?Donations
granted byOrganizations
, (a) are organizations that reside outside your election district allowed to make donations, or (b) is this only permitted for organizations located in your specific district?Organizations
, it would be necessary to move theDistrict
PK fromPerson
toParty
. I am thinking that, actually, that could be the right place for this attribute, but let me know your ideas about it. I have illustrated this option in Model C.Donations
are allowed by eachParty
? (a) One, (b) a fixed number, or (c) an undefined quantity?MinimumAmount
and (b) aMaximumAmount
per individualDonation
?MinimumAmount
and (b) aMaximumAmount
per the total set ofDonations
granted by a certainParty
?Election
instance?District
occurrence?DistrictNumber
for this purpose in all the models, but it may well be a different attribute.Response to your comments
Yes —as far as I know— Type-Subtype relationships were introduced, as such, in the relational paradigm back in 1979[5] and, unfortunately, they are not widely used in common implementations. But now that we know them, we can take advantage of this interesting and powerful structure in your scenario.
Do not worry, it is completely understandable that you are familiar with other computer science branches but not that much with relational databases, and I consider that this is a good oportunity for you to know more about this fascinating discipline, we all learn new things every day.
Yes, INSERTing a row (or set of rows) in a database table implies different considerations than, say, creating a record or file in a programming language.
We need first to focus in obtaining a stable logical data model, then we will easily derive a physical model (DDL statements or “CREATE Table code”). Once we have a clear target (all the aforementioned) defined, I will be glad to offer some ideas and guidance in relation to your INSERTs implementation and other pertaining elements.
Notes
1. Party is a term used in legal contexts to refer to a single person or group of persons that compose an individual entity.
2. Integration Definition for Information Modeling (IDEF1X) is a highly recommendable data modeling technique that was defined as a standard in december 1993 by the United States National Institute of Standards and Technology (NIST).
3. IDEF1X defines key migration as “The modeling process of placing the primary key of a parent or generic entity (or supertype) in its child or category entity (or subtype) as a foreign key”.
4. The use of role names is recommended since 1970 by Dr. E. F. Codd in his seminal paper entitled “A Relational Model of Data for Large Shared Data Banks”. For its part, IDEF1X —keeping fidelity regards relational practices— also advocates role naming.
5. By Dr. Codd (as one would expect) in his work entitled “Extending the Database Relational Model to Capture More Meaning”.