Need help on the first ER diagram

database-designerd

I just started my first Database course online, and I have a homework assignment to create an entity-relationship diagram (ERD) from a list of specifications.

The specs are as follows:

  1. The company is organized into departments. Each department has a name, an unique number, and a particular employee who manages the department. We keep track of the start date when the employee began managing the dept. A dept may have several locations.

  2. A dept controls a number of projects, each of which has a name, unique number, and single location.

  3. We store each employee’s name, SSN, address, salary, sex, and DOB. An employee is assigned to one dept but may work on several projects, which are not necessary controlled by the same dept. We keep track of the number of hours per week that an employee works for each project. We also keep track of the direct supervisor of each employee.

  4. We want to keep track of the dependents of each employee for insurance purpose. We keep each dependent’s first name sex, DOB, and relationship to the employee.

E-R Diagram

Best Answer

For your first database class and first attempt at an ER Diagram (ERD) I think you have done a great job! I'd like to give you some feedback in the context of the process I use to break down a set of requirements like you were given and create a draft diagram. Hopefully by taking this approach I'll help you in your development of the skill of ER modeling and database design and not just give you an answer to a homework assignment.

Find the Entities and Relationships

The idea behind an ERD is to identify the entities - things of fundamental significance to the business - and how they are related. Thus the name Entity-Relationship Diagram. The goal at this point is not to design or implement a database but instead to develop a more structured understanding of the domain of interest on which a database will ultimately be based. A good way to get a start on finding the entities is to look for the nouns in the requirements and then pick out those which are persons, places, things, concepts, or events. Then, you can find most of the relationships by looking for the verbs that connect those nouns. Here is my first pass at finding entities and relationships in the requirements provided:

enter image description here

I highlighted the nouns in a bright green, and the verbs in a bright pink. Now notice there is one noun - supervisor - that is highlighted in a darker green. We'll ignore that for now. Comparing what I found to what you placed on your ERD we are on the same page! The only difference is that I also highlighted location. I can see why this was left off of your ERD as its not clearly an entity. You could just as easily think of it as an attribute - which is a common property or characteristic that all entities in an entity type share - of the department and the project and that is how you modeled it. The reason I elevated it to an entity type is this:

A dept may have several locations

This indicates that each department will be associated to one or more locations, and thus necessitates breaking this out to its own entity type. A second reason to elevate it is that we can think of a location as a place and as such would have its own attributes such as name, address, city, state, zip, and so on. A third reason to elevate it is that the location is referenced by more than one entity type - the department and the project. If you find the element in question is referenced by more than one already identified entity, you should consider it an entity in its own right that is related to the others, and not simply an attribute of the others. There is no science to this however - its purely a judgment call. This is why ER diagramming is at the conceptual level as its subjective to individual perspectives. One person's entity is another person's attribute depending on their perspective into the domain of interest.

Regarding the relationships, here is a list of what I highlighted in pink:

  1. Employee manages Department
  2. Department controls Project
  3. Employee assigned to Department
  4. Employee works on Project
  5. Employee has Dependent

Of these, you identified all but the first one. This, along with the supervisor I mentioned earlier, gets into a discussion of roles which we'll postpone to later. Right now we are just trying to identify the fundamentals and I'd say you were right on.

Associative Entities

Looking at the relationships we have a few that are many to many. The requirements state that an employee can work on many projects, which are not necessarily controlled by the same department. Given that most projects have more than one member, we can safely assume this is a many to many relationship. When you have such a relationship on an ERD, it is perfectly acceptable to show a many to many relationship line. I typically only use the many to many relationship though if I'm not showing attributes at all. Since we are showing them here, I prefer to resolve the many to many relationship with an associative entity. Even if there are no attributes for the entity now, we may uncover some as analysis proceeds. However, in the case of employees and projects, we have attributes - the hours worked - that must be added for that relationship and so it must be resolved by creating the associative entity to show them. Thus we will create an associate entity called Project Assignment. But we are still not done. The requirement calls for us to know the hours worked per week. When you think about it, you realize there would be many weeks in the life of the project, and we need to record the hours worked by that employee on that project each week. Thus another entity type is required which I called Hours Worked of which there will be many occurrences per project assignment. This entity type will hold the week ending date and the hours worked.

Next, by creating a location entity type, the requirement that states a department may have several locations means we now have a relationship where one department can have many locations. It is safe to assume as well that one location will have many departments operating there. Therefore I added another associative entity called Operating Site to show this.

Find the Attributes

The attributes were listed pretty clearly in the requirements with statements like:

  1. Each department has a name, an unique number,
  2. keep track of the start date when the employee began managing the dept
  3. projects, each of which has a name, unique number,
  4. store each employee’s name, SSN, address, salary, sex, and DOB.
  5. keep track of the number of hours per week that an employee works for each project.
  6. keep each dependent’s first name sex, DOB, and relationship to the employee.

What I do at this point is list out the entities and relationships identified and place the attributes identified with the entity type it belongs to:

  1. Department - Number, Name
  2. Employee - Name, SSN, Address, Salary, Sex, DOB
  3. Location - ???
  4. Project - Number, Name
  5. Dependent - First Name, Sex, DOB, Relation

Note I can't find any attributes for location. This would be a red flag to go back to the business to get some additional information about locations!

Then list the relationships and their attributes:

  1. Employee manages Department - Start Date
  2. Department controls Project -
  3. Employee assigned to Department -
  4. Employee works on Project - Hours Per Week
  5. Employee has Dependent -
  6. Department has Location -

For those last two relationships the verb has isn't very descriptive, but that was how the requirements were phrased. The entity has an attribute, so why wouldn't these be attributes? I discussed earlier why I would choose to make location an entity type. For dependent, the choice is more clear as its clear each employee has many dependents and so it must be broken out into its own entity. Once this is done we could make the leap to improve the verb. Perhaps something like employee cares for dependent and department operates at location.

Now we are in a position to create the diagram. Here is what I have: enter image description here

This was created using Oracle SQL Developer Data Modeler - a free download - which is a great tool if you want to continue the process into database design and creation.

Primary Keys

You have identified primary keys. Technically, an ERD does not require you even consider keys. If you think of the ERD as a tool to model the business, its not important to determine at this point exactly how you will uniquely identify an occurrence of each entity. Instead, you can assume that at a later point you will figure this out, and instead focus only on the entities, their relationships, and their attributes. It is a good idea to note which attributes are unique for each entity as this will help choose keys. Once you complete the diagram and iterate upon it with the business, you can go back and complete it by choosing the right keys. In my diagram I only noted that the two numbers mentioned as unique in the requirements were unique on the diagram using a U beside the attribute name.

Entity Roles

Now to address the dark green highlighted supervisor I mentioned earlier. The requirements stated that "a department has an employee who manages the department." They go on to state "We also keep track of the direct supervisor of each employee." So what we have here is an entity - a person - who is playing different roles. The employee might be a manager. The employee might be a supervisor. So how do we resolve this? First, we need to decide if the manager and the supervisor are the same thing! Is it the case that the manager of a department is also the supervisor of the employees assigned to it? If yes, the solution is simple. A new entity type called Assigned Manager can be created with a one to many relationship from department to it and from employee to it, with a single attribute of the start date when that employee started managing that department. Then, each employee's supervisor is assumed to the the manager of the department they are assigned to. If no, then we can add a recursive relationship from employee to employee to represent the the supervisor and the employee. This is the simplest approach but does introduce a mixing of roles in the same entity type. A better approach is to add a new entity type called Assigned Supervisor with two one to many relationships from employee to it - one to represent the supervisor and the other to represent the supervised. If there is a date when the assignment began and ended this could be added as well.

Conclusion

Hopefully this, coupled with the description of the process used to develop it, will be a good learning tool for you to understand how you get from a description of a business process to a draft ERD. I say draft because it is just a starting point. Once completed, it serves as a communication vehicle to validate the requirements, find holes in them, uncover new ones, and so on, before any programming or even system design begins. Remember that doing a great job building a system that implements requirements that were mis-understood will be a failure and it will be much cheaper to modify an ERD than a working system!

References

Two really good reference for ER diagramming are Steve Hoberman's Data Modeling Made Simple, which is a great overview, and David Hay's Enterprise Model Patterns, which gives a great in depth look at common patterns you find when analyzing organizations. Both of these references give much more detail on describing relationships using verb and prepositional phrases, identifying and non identifying relationships, and strong and weak entities - concepts I didn't address. Fabian Pascal's Practical Database Foundation Series is also excellent, and has a great first paper that is the perfect compliment to the books as Fabian describes the entire process of determining data requirements. Remember that ER diagrams can only show keys and references, whereas in fact there are many more kinds of business rules that can be explored and implemented in the ultimate database.