How to Transfer Data from Denormalized to Normalized Tables in Oracle

migrationnormalizationoracle

Hi I currently have a database set up with a table like this:

Grades: student name|class name|overall|hw|test|quiz|project|val

where student name and class name are type varchar2 with the rest being type number. The categories overall,hw,test and quiz are expected to be 1 or 0 and val is the actual grade out of 100. So a sample row with a student getting a 95 on her test would look like this:

Grades: Sally | Biology | 0 | 0 | 1 | 0 | 0 | 95

This structure is clearly suboptimal, so I decided to normalize this database with help from this question here to :

Student: id|firstname | lastname (pk = id)

Class: id|name|room num (pk = id)

Categories: id|name|weight (pk = (id, name, weight))

Grades: student_id (fk)| class_id (fk)| category_id (fk) | name | grade_val 

ClassEnrollment: student_id (fk)| class_id (fk) | overall_grade (pk = student_id, class_id)

My main question is how to get the data from three tables to fill out the class enrollment table?

I can select all the overall grade using:

SELECT val
FROM grades_old
WHERE overall = 1;

And I can select the student id in a number of steps :

INSERT INTO temp(student, class)
SELECT student_name, class_name 
FROM grades_old;

SELECT temp.student_name, students.name, students.id
INNER JOIN students
ON temp.student_name = students.name

However, I'm not sure how to put all this together to get the overall grade, student_id and class_id in one query.

Edit:

I have manually filled out the Category database since there is only 5 categories in the system right now (overall, hw, test, quiz and project). I have a name column in the student table for now for purposes of matching, I figured I could worry about splitting it later then drop that column once I have split the names.

So I guess my focused question is for now how can I select the overall grade, student_id and class_id from the original grades table to insert it into the class Enrollment table?

Best Answer

General

In general, you need to fill out the table in hierarchical order (parent, child, grandchild, etc). The best way to do that is using SELECT DISTINCT.

This works very well if the text value (eg class_name) is UNIQUE. This way, you can JOIN your source data to get the parent Id ( class_id ) for your child tables.

In some cases, as the process gets more complicated, you may want to consider creating some staging tables. These are tables whose purpose in life is to hold partially processed data temporarily so that you can review it before going onto the next step.

Your Specific

First, I am assuming that one and only one of the Category columns will have a value of 1.

First thing you need to do is UNPIVOT all of your data into a staging table so that you can use @Akina's method.

with data as (
SELECT
  'Sally' AS first_name
 ,'Biology' AS class_name
 ,0 AS overall
 ,0 AS hw
 ,1 AS test
 ,0 AS quiz
 ,0 AS project
 ,95 AS val
FROM dual
)
select *
from data
UNPIVOT (
  is_used
  for Category in ( OVERALL,HW,TEST,QUIZ,PROJECT)
)
WHERE IS_USED = 1;

This will return a row like this

Name  | Class | Val | Category | IS_USED
Sally | Biology | 95 | TEST | 1

Now you can

  • SELECT DISTINCT student_name to get individual students for Student.
  • SELECT DISTINCT class_name to get individual Classes for each Class.
  • SELECT DISTINCT student_name, class_name to get everyone's ClassEnrollment
    • You'll need to join to Class on class_name to get class_id
    • You'll need to join to Student on student_name to get student_id
  • Finally, after you JOIN all the tables together, you can record everyone's Grades

This assumes that you have clean data. @Akina's method seems to assume your data is not clean (which is usually the case)