Database Design – Master Status Table vs Private Status Table Pros and Cons

database-design

I am the only DBA at our company and am relatively new in the profession (having 2.5 years of experience now). The developers and I are currently having a discussion about how to properly design the database for a new piece of software being developed. They develop their applications in modules and believe that having a private status table for each module is best, where as I believe having one table to hold all internally generated states (and then possibly additional tables to give those states some meaning) would be best. The sample code below is written in postgresql, since that's what we use.

My approach:

CREATE TABLE status (
  status_id SERIAL PRIMARY KEY,
  status_code VARCHAR(3) UNIQUE NOT NULL, -- short description
  status_description VARCHAR(50) UNIQUE NOT NULL -- long description
);

CREATE TABLE member_status (
  status_id INT PRIMARY KEY REFERENCES status (status_id)
);

Their approach:

CREATE TABLE member_status (
  status_id SERIAL PRIMARY KEY,
  status_code VARCHAR(3) UNIQUE NOT NULL, -- short description
  status_description VARCHAR(50) UNIQUE NOT NULL -- long description
);
-- rinse and repeat for every object that may need a status

They only see "problems" with my approach (which have yet to be clearly articulated) and, while I don't see problems with their approach, I do believe it is not the best design.

Any input into which design is better and WHY would be appreciated.

Best Answer

The main problem with your approach occurs when you want to maintain referrential integrity to the status table.

Suppose you have a member table:

CREATE TABLE member (
  member_id SERIAL PRIMARY KEY,
  status_id int NOT NULL,
  ...
);

Now if you want to enfore referrential integrity to ensure each member has a valid status, you can't. This is because only a subset of the status rows applies to members. As a result, a member can be assigned a purchase_order_status and RI will still pass.

EDIT2

In a comment you clarified that in your solution member would have RI to the single-column member_status table. This suggests your design could land up having a large number of single column tables, just to control refferential integrity. Now any time you add a new status, you need to add it to 2 tables.

There's another problem with your master_status table.

status_code is unique. Meaning that purchase_order and sales_order cannot use the same codes. This seems to be an unnecessarily harsh restriction. However, droping the unique constraint risks duplicating codes within a category.

So the next thing you might consider is adding a status_type and enforcing the unique constraint accross both type and code. So your status_type attempts to fix the underlying problem by artifically splitting the master table into multiple tables.

Basically the problem boils down to: You are trying to consolidate multiple distinct sets of data into a single table because there is a similarity in that some of their columns overlap.


Joe Celko has written a fairly in-depth article on Lookup Tables in SQL.


EDIT On second thoughts, perhaps your approach is not that much like "One True Lookup Table" because you do acknowledge the need for additional tables where more detail is required. The RI issue still remains however.