One to one mapping vs many to one mapping on data base design

database-design

I'm currently doing call detail records (CDR) data analysis as my project. On course of data-base design, I had separated the date (table contain id, day,month,year), time field (table contain id, second, minute, hour etc) from call table (table contain called number, duration, time_id, date_id etc) as millions of call transaction occurs in a day. Now I got confuse on how to map date, time to call table i.e. do one-to-one mapping or many-to-one mapping. One to one mapping can be done easily but data are redundant but I have no idea about many-to-one mapping. What I think about many to one mapping is I have first extract date and check if it exist or not. if exit then get its id otherwise insert date and get its id. (Background:- I have a cdr data with field card_no, service_key, calling_no, called_no, answer_time, clear_time, duration on text file on tab separated format and have to make data mart as like figure below.
enter image description here

I am not good on telling problem so if any information required then comment below.

EDIT

On cdr table

  1. Card_no is mobile number of the subscriber and if the subscriber dialing to the other number than if is equal to calling and if subscriber is receiving the call then card_no is equal to called. In call table its repeating.
  2. sql_time_stamp means answer_time and as duration is already in database then why I have to store clear_time also ( if there is any reason then let me know)

On Subscriber table

Subscriber table contain dob which is static so I create another table as this contain age (which may be changing) just to separate static part and dynamic part also gender is include to reduce multiple table joining as analysis just contain age(_group) and gender.

Best Answer

Some observations on this data model based on some assumptions of your intent. Your cdr data is a tab-delimited text file. The cdr data becomes the central transaction table. A transaction table has a large number of rows that change often. It looks like there are other lookup tables surrounding the call table. A lookup table has a small number of records that do not change often. I think it is good that you have a separate field, a unique integer, as the key to each table. Integer keys are the best for indexing.

Calls

  • Create an autogenerated key on this table similar to the others. Then put a unique index on [calling party key], [date key], [time of day key]. It is important to avoid using dates and times as keys.
  • Add [card no] field. Is this a unique identifier of a cdr record? If so, use it as the new primary key.
  • Remove foreign key to subscriber demographics.
  • Add a [call date] field.
  • Remove the foreign key to Date table.
  • Add a [holiday indicator] field. One character.
  • Does the [SQL Date-Time Stamp] mean [answer time]?
  • Add a [time interval] field.
  • Remove the foreign key to Time of Day.
  • Add [peak time indicator] field. One character.
  • Add [call length classifier] field.
  • Remove foreign key to Derived Call information.

Subscriber
The [subscriber key] and [subscriber id] look like either could be a key. Do you need both?

Subscriber Demographics
It looks like a subscriber demographic record depends only on the subscriber, not on the call detail.

  • Move the relationship from subscriber demographics to subscriber instead.
  • Add foreign key to Subscriber.

Date
It is unusual to store all possible dates in a table this way. The individual components of a date (day, month, year, day of week, and weekend) can easily be computed. Generate them in a query. Maybe save the query or view. Do not store those attributes physically. The date table is not needed.

  • Store the date and holiday indicator in the Calls table.
  • The holiday indicator would be a 1 byte field.
  • Remove the Date table.

Time
It is unusual to store all possible time including seconds in a separate table.

  • Move the time and interval to the calls table.
  • Remove the time table.

Mobile Service
This looks like a lookup table.

Rate Plan
This looks like a lookup table.

Calling Party
This looks like this gets populated from the cdr records. And then the activation date is updated on the record.

Derived Call Information

Option 1 if derived call information is mostly filled. This looks like it depends only on the call.

  • Move the [peak time indicator] and [call length classifer] fields to the calls table.
  • Then you can remove this table.


Option 2 if derived call information is mostly empty. You can save space in the Calls table by keeping the derived call table.

  • Add [card no] as the primary key. It is a 1 to zero-or-1 relationship. Every derived call must belong to a call. Not all calls have derived calls.

Edit: removed Calls.[clear time].