Database Design – Splitting Information About People, Accounts, and Transactions

database-design

I am having trouble deciding which design would be considered a better practice.

Say, for example, I have two tables:

TABLE Person
pid int primary key
name varchar
age int
bank_account varchar

TABLE Transactions
tid int primary key
date datetime
amount float
from int references person
to int references person

This is just an example and not using any particular syntax.

Now, my Person table will have multiple entries that have all the same information aside from the bank_account, since a person can have multiple different accounts they make transactions from or receive to. Would it be better to do it like this, or would it be better to create a separate account table that holds all the different account numbers with a reference to the person that owns it, and then changing the references in the Transactions table to reference the new account table?

I know that duplicating information in person is in accordance with 1NF, but it doesn't seem like good design to me. Any insight would be appreciated.

Best Answer

It seems a cliché, but it all depends on the specific informational requirements of relevance, because a database has to be designed with the exact business needs in mind.

As for the circumstances at issue:

In case that in the applicable business environment each Transaction is performed from Person to Person and the concept of Account does not exist, then leaving the tables as you decribe in the no-particular-sintax example seems to be mirroring the characteristics of the business context accurately.
In case that, on the contrary, in said business scenario every Transaction is carried out from Account to Account, then, yes, creating an Account(s) table which would be referred to from two FOREIGN KEY constraints in the Transaction(s) table would be the best practice, because by doing so you would be reflecting the characteristics of the business domain with precision. The Transaction(s) rows would then be associated with the respective Person(s) rows via the corresponding Account(s) counterparts.

You state in the question itself that

[...] a person can have multiple different accounts they make transactions from or receive to.

...therefore, proceeding as mentioned in case no. 2 would be more convenient in your very specific situation.

On the other hand, when designing a relational database, I highly advise that you start (i) analyzing the situation from a conceptual perspective —identifying entity types, properties, associations, drawing diagrams, etc.—, then (ii) continue establishing the logical aspects —laying out the corresponding tables, types, columns, constraints to ensure integrity, domains if possible, testing with the normal forms, etc.—, and then (iii) putting up the physical settings of convenience —optimizing database response by configuring indexes, partitions, hardware upgrades, network configurations, etc.—. In this manner you would be approaching each scenario with a clear separation of concerns.

Regarding first normal form (1NF) —which is brought up in the question—, it basically has to do with the definition of atomic domains (i.e., each column must contain exactly one value of the applicable domain/type per row), which, as well is entirely dependent on the informational requirements of significance. Further normalization (2NF and above), in turn, implies getting rid of dangerous redundancies (special types of undesirable data duplication) and is executed in terms of functional dependencies (interrelationships between sets of attributes/columns).

By the way, you may find of interest this Q & A, which involves a similar scenario and contains several notions about database design that may be applied from a general point of view

Related Solutions

Sql-server – Do I need a separate Id column for this “mapping” table

If you have a one-to-many relationship between Producers and Products (in other words, a product can only belong to one producer), than it would make sense to just put a foreign key reference directly in your Products table:

One-To-Many

create table Producer ( id int identity(1, 1) not null primary key clustered, Name varchar(100) not null ) go create table Product ( id int identity(1, 1) not null, Name varchar(100) not null, ProducerId int not null foreign key references Producer(id) ) go

But if there's a chance that this will be a many-to-many relationship, then your best bet would be to use a Join table.

Many-To-Many

create table Producer
(
    id int identity(1, 1) not null primary key clustered,
    Name varchar(100) not null
)
go

create table Product
(
    id int identity(1, 1) not null primary key clustered,
    Name varchar(100) not null
)
go

create table ProductProducer
(
    ProductId int not null foreign key references Product(id),
    ProducerId int not null foreign key references Producer(id)
)
go

-- adding the primary key also ensures uniqueness
alter table ProductProducer
add constraint PK_ProductProducer 
primary key (ProductId, ProducerId)
go

If you decide to go with the Join table, you wouldn't need to have an additional key, as the combination of ProductId/ProducerId would ultimately be unique. You could use them as a composite key, so you wouldn't need that additional Id field in ProductProducer.

Mysql – Using the same primary key as a foreign key for every table

In general it is bad schema design to have two tables in a 1:1 relationship. (There are exceptions, but yours is not one.)

In general, birthdate should be a single column of type DATE, not 3 columns.

Best Answer

Related Solutions

Sql-server – Do I need a separate Id column for this “mapping” table

Mysql – Using the same primary key as a foreign key for every table

Related Question