New table or new column

database-design

Apologies in advance if this is a dumb question.

Background:
I am a student intern, I've been given the task of adding a new feature the companies client administration system. Basically it involves "purging" subscriptions, they want selected subscriptions to be removed from the client admin system but remain in the database for recording purposes. We currently have around 600,000 subscriptions. They wish to "purge" 10,000 subscriptions right away and another 30,000 over the next few months, so around 6% of the subscriptions table.

What I am wondering is what is the best way to accomplish this. I have come up with two options;

a) Add a new table to the database to store the "purged" subscriptions with 3 columns, purgedId, subscriptionId which is a fk referencing the subscription, and purgedDate. Then in any sql that references the subscription table left outer join the purged table and use "where purgedId = null".

b) Add a new column to the subscription table to store the purgedDate and add "where purgedDate = null" to all of my subscription sql.

Option a) seems like a waste, to add a whole new table to store essentially one attribute.
Option b) seems like a waste, to add a column that is only used by 6% of the rows in the table.

On another note, is an outer left join the best way to exclude the purged subscriptions?

Thanks in advance for any assistance you are able to provide and once again sorry if this is an elementary question.

Best Answer

I'd go with option 3, which will require less overall work:

  • Make a lookup table that contains a list of purged records and dates, like in your first option
  • Rename your current table to something like dbo.Tablename_master
  • Create a VIEW named dbo.Tablename which consists of:

.

SELECT <fields>
FROM dbo.Tablename_Master
WHERE PrimaryKeyField NOT IN (SELECT PrimaryKeyField FROM PurgeTable)

Now you don't need to modify any existing code, since all references to your original table go through the view instead. This will be complicated if you have any existing constraints but they should be easy to work around.