Mysql – Peer review database design for various reports

database-designinnodbMySQLPHP

I'm currently trying to redesign a poorly thought out database for a peer review system that we use internally at my workplace. Employees can log on and submit a review about another employee who worked on the same team for a specified project.

I was initially tasked to just remove one of the questions (from 4 to 3) but they are stored as columns in a review table instead of as their own table. I'm attempting to redesign the database with possible additions of questions in mind, and here's what I've got so far:

---------------
| category    | // the questions you are asked to rate an employee by
---------------
| id (PK)     |
| name        |
| description |
---------------

--------------------
| score            | // the score you give to a category
--------------------
| id (PK)          |
| score            | // ENUM (1,2,3)
| category_id (FK) | // category.id
| review_id (FK)   | // review.id
--------------------

--------------------
| review           |
--------------------
| id (PK)          |
| date             |
| for_month        |
| reviewee_id (FK) | // employee.id
| reviewer_id (FK) | // employee.id
| project_id (FK)  | // project.id
| comments         |
--------------------

--------------
| department |
--------------
| id (PK)    |
| name       |
--------------

----------------------
| employee           |
----------------------
| id (PK)            |
| first_name         |
| last_name          |
| username           |
| department_id (FK) | // department.id
| start_date         |
| end_date           |
| is_admin           |
----------------------

------------
| project  |
------------
| id (PK)  |
| name     |
| end_date |
------------

--------------------
| project_employee | // maps employees to projects
--------------------
| project_id (FK)  | // project.id
| employee_id (FK) | // employee.id
--------------------

It seems like I'm going in the right direction, at least for storing reviews. However, retrieving data about reviews for managers feels overly complicated, involving lots of joins. I want to provide multiple views of the data, such as:

  • List of employees with year average scores
  • Reviews for individual employees
  • List of departments with year average scores
  • Reviews for individual departments
  • List of projects with average scores
  • Reviews for individual projects
  • Reviews by month

Things like showing average yearly scores for a department get fairly convoluted, since there is no direct link between scores and departments, or even reviews and departments.

Would it be overkill to add foreign keys to the score table for project, reviewer, reviewee, and department? That way I can very directly query the score table to get scores for departments. Or maybe I should keep what's in review and just add department? Or should I suck it up and keep writing long queries that join multiple tables from score to review to employee to department to get that data? Or am I just doing this wrong?

EDIT: After thinking about it some more, I'm leaning towards adding department_id to the review table, since that should give me everything I need to lookup reviews and grab scores. It also solves another unrelated issue, that of what to do when an employee changes departments, since the original reviews should still count towards the department they were in at the time. My question of if I'm doing this right in general still stands…

Best Answer

I think you are doing well going for the 4th normal form and exploding your data. Now, as you found out, you need to de-normalise to support for historical data support: if the review will always account for a particular department's score whatever the current department of the reviewee_id is, then yes, you definitely need to add a department at the review table and make sure you know what/why it is there for.

Push the logic further: what if there is a 4th and 5th category added in a year? How do you calculate past and future averages then? On 3 or 5 categories? Then another year later, the first category gets removed? Will you delete all scores for that category, or change yet again the average's calculation based on the number of "category" at the time of the "date" of the "score"?

Sometime, too much normalisation is not the best course of action. You have a good core here. Time to custom-fit now. Take a step back, plan for the future, ask the end-users to make sure you support all their foreseeable needs, and look at the impacts of adding/removing/updating departments, employees and categories on historical reporting.

As for reporting, if you are using powerful data tools like IBM Cognos, it will be easy to generate data cubes, then let managers do their own reports. If you are using tools like Microsoft Excel with "external data", then the easiest course of action would be to pre-create Views or Stored Procedures that crunch up the data. The caveat here is that you will have to create a new View/SP for every manager that comes up with a new request/idea/variant.