Is this data duplication when two exact pieces of data have different lifespans

database-designduplication

In my system, user has many bank accounts stored in bank_accounts table. Each bank account has a currency column. Now, users can receive payments to those bank accounts, stored in payments table. So, each payment is related to one row in bank_account table. In addition to bank_account_id, each row in payments table also has amount and currency columns. As you can see, the currency column on payments table causes data duplication, because the currency could always be derived from the related bank account. However, I'm afraid that in the future a bank account may be deleted from the system but the related payments will have to stay. Is it a valid argument to conclude that I don't have duplicated data in my schema?

Best Answer

Normalization rules are a useful guideline, not a strict law.

Even if you don't anticipate the currency field in the payments table to ever be different than the currency field for the related bank_account, I can think of any number of reasons to keep both:

  • What if the account record is ever deleted? (Accidentally or deliberately)
  • What if the account record changes currency type?
  • The field could be used as a point of immediate validation before committing the payment
  • The field could be used after the fact as an integrity/validation audit
  • It could improve performance for certain queries instead of requiring a join to the bank_accounts table.
Related Question