Sql-server – designing a star schema

sql serverstar-schema

I am a newbie and would appreciate suggestions/material to research

I have 3 types on information in a single table

  1. problems

  2. reasons

  3. fixes

problem data is related to reasons by 1 to many

reasons data is related to fixes by 1 to many

I am thinking of creating a dimension table for dates as mentioned here

How could I design a star schema out of it? As of now the entire data is in a single table (all joined together)

#####################
update 1 – example data and my 1st solution below. Please guide

my columns are

problem id
problem description
problem associate department
problem due date
reason id
reason description
reason associated person
reason due date
fix id
fix description
fix owning employee
fix due date

example data

enter image description here

+------------+---------------------+------------------------------+------------------+-----------+--------------------------+--------------------------+-----------------+--------+----------------------+---------------------+--------------+
| problem id | problem description | problem associate department | problem due date | reason id |    reason description    | reason associated person | reason due date | fix id |   fix description    | fix owning employee | fix due date |
+------------+---------------------+------------------------------+------------------+-----------+--------------------------+--------------------------+-----------------+--------+----------------------+---------------------+--------------+
|          1 | wrong customer info | sales                        | 1-Jan-21         |         1 | customer gave wrong info | a                        | 15-Dec-20       |      1 | contact customer     | bb                  | 10-Dec-20    |
|          1 | wrong customer info | sales                        | 1-Jan-21         |         2 | customer gave wrong info | a                        | 15-Dec-20       |      2 | verify customer info | cc                  | 15-Dec-20    |
|          1 | wrong customer info | sales                        | 1-Jan-21         |         1 | data entry error         | b                        | 16-Dec-20       |      1 | correct data entry   | bb                  | 16-Dec-20    |
|          1 | wrong customer info | sales                        | 1-Jan-21         |         1 | database wrong join      | c                        | 20-Dec-20       |      1 | fix join             | aa                  | 19-Dec-20    |
|          1 | wrong customer info | sales                        | 1-Jan-21         |         2 | database wrong join      | c                        | 20-Dec-20       |      2 | something else       | dd                  | 19-Dec-20    |
|          1 | wrong customer info | sales                        | 1-Jan-21         |         3 | database wrong join      | c                        | 20-Dec-20       |      3 | modify tables        | bb                  | 19-Dec-20    |
+------------+---------------------+------------------------------+------------------+-----------+--------------------------+--------------------------+-----------------+--------+----------------------+---------------------+--------------+

I am thinking of a design like below. all joins would be 1:1 in case of fact table. In case of fact table2 calendar there would be 1:many join as that fact table would have one row per date while the dim tables could have same date on multiple rows

enter image description here

Best Answer

A very basic star-schema for your situation might look like this:

enter image description here

The T-SQL statements to create the dimension tables for this schema look like:

CREATE TABLE dbo.problems
(
    problem_id int NOT NULL
        IDENTITY(1,1)
        CONSTRAINT problems_pk
        PRIMARY KEY
        CLUSTERED
    , problem_description nvarchar(400) NOT NULL
);

CREATE TABLE dbo.reasons
(
    reason_id int NOT NULL
        IDENTITY(1,1)
        CONSTRAINT reasons_pk
        PRIMARY KEY
        CLUSTERED
    , problem_id int NOT NULL
        CONSTRAINT reasons_problem_fk
        FOREIGN KEY
        REFERENCES dbo.problems(problem_id)
    , reason_text nvarchar(500) NOT NULL
);

CREATE TABLE dbo.fixes 
(
    fix_id int NOT NULL
        IDENTITY(1,1)
        CONSTRAINT fixes_pk
        PRIMARY KEY
        CLUSTERED
    , reason_id int NOT NULL
        CONSTRAINT fixes_reason_fk
        FOREIGN KEY
        REFERENCES dbo.reasons(reason_id)
    , fix_text nvarchar(525) NOT NULL
);

CREATE TABLE dbo.dates
(
    date_id int NOT NULL
        IDENTITY(1,1)
        CONSTRAINT dates_pk
        PRIMARY KEY
        CLUSTERED
    , starting_date datetime NOT NULL
    , ending_date datetime NOT NULL
    , CONSTRAINT dates_dates
        CHECK (ending_date > starting_date)
);

The fact table, which you seem to have left out of your question, and I've named incidents, looks like:

CREATE TABLE dbo.incidents
(
    incident_id int NOT NULL
        CONSTRAINT incidents_pk
        PRIMARY KEY
        CLUSTERED
    , incident_date_id int NOT NULL
        CONSTRAINT incidents_incident_date
        FOREIGN KEY
        REFERENCES dbo.dates (date_id)
    , problem_id int NOT NULL
        CONSTRAINT incidents_problem_fk
        FOREIGN KEY
        REFERENCES dbo.problems(problem_id)
    , fix_id int NULL
        CONSTRAINT incidents_fix_fk
        FOREIGN KEY
        REFERENCES dbo.fixes(fix_id)
    , fix_date_id int NOT NULL
        CONSTRAINT incidents_fix_date
        FOREIGN KEY
        REFERENCES dbo.dates (date_id)
);

incidents references the dimension tables such that you are storing the bare minimum of information in the fact table. The fix_id column is referenced in the incidents table to indicate which of the possibly many fixes was accepted as the actual fix used in this incident. Since each fix is related to only a single reason, you can infer the reason from the fix. I've included a discrete column to indicate the problem in this table to enable incidents that don't yet have a fix.

You could add a reason_id foreign-key column to the incidents table if desired. For instance, maybe you identify a reason prior to determining the fix.