PostgreSQL – Developing a Database for Funds Transfers Business

database-designpostgresqlsubtypes

In the business context of relevance, both members and organizations need to have an account for funds. Funds can be transferred

from member to member,
from member to organization,
from organization to organization, and
from organization to member.

Considerations

In order to build a database for such a scenario, I have created the following three tables:

CREATE TABLE Members ( 
  memberid serial primary key, 
  name varchar(50) unique, 
  passwd varchar(32), 
  account integer 
);

CREATE TABLE Organizations (  
  organizationid serial primary key, 
  name varchar(150) unique, 
  administrator integer references Members(memberid), 
  account integer 
);

CREATE TABLE TransferHistory  
  "from" integer, -- foreign key?
  "to" integer, -- foreign key?
  quantity integer 
);

I think that the TransferHistory table is necessary to show who/what sent funds to whom/what.

The problem is, since Members and Organizations are different tables, how can I reference them from the TransferHistory table?

For instance, the data involved can show up as follows:

Account      Account      Quantity
-----------  -----------  --------
 1072561733  38574637847       500
38574637847   1072561733       281

That would suggest that accounts need to be recorded in the same table, but accounts are for two different kinds of owners (members and organizations), each of which are retained in their respective table.

I could create a table called Accounts, so now I would have four tables:

CREATE TABLE Members (
  memberid serial primary key, 
  name varchar(50) unique, 
  passwd varchar(32), 
  accountid integer references Accounts(accountid) 
);

CREATE TABLE Organizations (
  organizationid serial primary key, 
  name varchar(150) unique, 
  administrator integer references Members(memberid), 
  accountid integer references Accounts(accountid)
);

CREATE TABLE Accounts ( 
  accountid serial primary key, 
  state integer 
);

CREATE TABLE TransferHistory ( 
  "from" integer references Accounts(accountid), 
  "to" integer references Accounts(accountid), 
  quantity integer 
);

…but now I have to ensure that each foreign key from Members and Organizations tables does not point to the same Account row in Accounts table…

…or I could have an Accounts table having two foreign keys, one pointing to Members and another to Organizations (and one foreign key column would have to contain a NULL mark at all times). But now things get a bit confusing with respect to queries. The database design would be, generally speaking, as follows:

CREATE TABLE Members ( 
  memberid serial primary key, 
  name varchar(50) unique, 
  passwd varchar(32) 
);

CREATE TABLE Organizations (
  organizationid serial primary key, 
  name varchar(150) unique, 
  administrator integer references Members(memberid) 
);

CREATE TABLE Accounts ( 
  accountid serial primary key, 
  member integer references Members(memberid), 
  organization integer references Organizations(organizationid),
  state integer 
);

CREATE TABLE TransferHistory ( 
  "from" integer references Accounts(accountid), 
  "to" integer references Accounts(accountid), 
  quantity integer 
);

So, does anybody have a suggestion on how to solve this problem?

Best Answer

If the intention is to build a relational database, it is really helpful to first carry out (a) an analysis of the business context of interest —in order to delineate a conceptual schema— in terms of entity types, inspecting their properties and associations before (b) thinking in terms of tables, columns and constraints —aspects that correspond to the logical level—. Following this course of action, it is much simplier to capture the meaning of the business domain with accuracy and then reflect it in an actual, well constrained, SQL-DDL design.

One of the numerous advantages offered by the relational paradigm is that it permits managing the data in its natural structure; hence one has to “find” such structure before employing relational instruments to manage it. It does not matter if the scenario at issue is related to a personal project (as you pointed out via comments): the more realistic you define it, the more you will learn from its development (if that is the purpose of this effort). Of course, a realistic personal project may evolve into a comercial one with relatively minor adaptations.

Business rules

In order to present a first progression that you may like to use as a reference, I have formulated some of the conceptual-level business rules that are among the most significant ones, and they are enumerated as follows:

A Person owns zero-one-or-many Accounts
A Person is primarily distinguished by its Id
A Person is alternately distinguished by his/her Firstname, Lastname, BirthDate and Gender
An Organization owns zero-one-or-many Accounts
An Organization is primarily differentiated by its Id
An Organization is alternately differentiated by its Name
An Organization starts operating at a FoundingDate
An Account is the Transferor in zero-one-or-many Transfers
An Account is the Transferee in zero-one-or-many Transfers
An Account is primarily identified by its Number
An Account is issued at an exact CreatedDate
In a Transfer, the Transferor must be different from the Transferee
A Person may log in via zero-or-one UserProfile

Since the associations —or relationships— (1) between Person and Account and (2) between Organization and Account are very similar, this fact reveals that Person and Account are entity subtypes of Party (basically, either an individual or a group of individuals), which is in turn their entity supertype. This is a classic information structure that arises very frequently in multiple conceptual models of diverse kinds. In this manner, two new rules can be asserted:

A PartyType classifies zero-one-or-many Parties
A Party is either a Person or an Organization

And two of the previous business rules can be consolidated into a single one:

A Party owns zero-one-or-many Accounts

Which can be also be stated from the point of view of the Account entity type:

An Account is owned by exactly-one Party

Expository IDEF1X diagram

Consequently, I created an expository (simplified) IDEF1X^† diagram that synthesizes the rules formulated above, and it is shown in Figure 1:

Figure 1 - Funds Transfer IDEF1X Model

Party, Person and Organization: Supertype-subtype structure

As demonstrated, Person and Organization are depicted as mutually exclusive subtypes of Party.

The Party supertype holds a discriminator (i.e., PartyTypeCode) and all the properties (or attributes) that are common to its subtypes, which, in turn, have the properties that apply to each of them.

Account

The Account entity type is directly connected with Party, which provides a subsequent connection between (i) Account and Person, and between (ii) Account and Organization.

Since it is possible that, in the real world, (a) a bank Account is not transferable, i.e., its Owner cannot change and (b) an Account cannot start being current or enabled without an Owner, the PRIMARY KEY of this entity type may be comprised of the properties PartyId and AccountNumber, so you should analyze the scenario even more thoroughly to define this point with high precision.

Transfer

On the other hand, the Transfer entity type presents a composite PRIMARY KEY made up of three properties, i.e., TransferorAccountNumber, TransfereeAccountNumber (role names I assigned to distinguish every one of two Account properties involved in each Transfer instance) and TransferDateTime (which tells the exaxct Instant when a Transfer occurrence was performed).

Factors about AccountNumbers

Be also aware that, in actual banking systems, the format of an AccountNumber data point is usually more complex than a “mere” integer value. There are different format arrangements, e.g., the one that corresponds to the International Bank Account Number (IBAN), defined by the ISO 13616 standard. This aspect implies, evidently, that the (1) conceptual analysis and the later (2) logical definitions require a much more exhaustive approach.

Illustrative logical SQL-DDL declarations

Then, as a derivation from the previous analysis, I have declared a logical design where

each table represents an entity type,
each column stands for a property of the respective entity type, and
multiple constraints are set up (declaratively) in order to guarantee that the assertions in form of rows retained in all the tables comply with the business rules determined at the conceptual tier.

I have supplied notes as comments that explain some of the features that I esteem particularly important with respect to the aforementioned structure, that is shown below:

-- You have to determine which are the most fitting 
-- data types and sizes for all your table columns 
-- depending on the business context characteristics.

-- Also, you should make accurate tests to define the
-- most convenient physical implementation settings; e.g.,
-- a good INDEXing strategy based on query tendencies.

-- As one would expect, you are free to make use of 
-- your preferred (or required) naming conventions. 

CREATE TABLE PartyType (
    PartyTypeCode CHAR(1)  NOT NULL, -- This one is meant to contain the meaningful values 'P', for 'Person', and 'O' for 'Organization'.
    Name          CHAR(30) NOT NULL,
    --
    CONSTRAINT PartyType_PK PRIMARY KEY (PartyTypeCode)
);

CREATE TABLE Party ( -- Represents the supertype.
    PartyId         INT       NOT NULL,
    PartyTypeCode   CHAR(1)   NOT NULL, -- Denotes the subtype discriminator.
    CreatedDateTime TIMESTAMP NOT NULL,  
    Etcetera        CHAR(30)  NOT NULL,  
    --
    CONSTRAINT Party_PK            PRIMARY KEY (PartyId),
    CONSTRAINT PartyToPartyType_FK FOREIGN KEY (PartyTypeCode)
        REFERENCES PartyType (PartyTypeCode)
);

CREATE TABLE Person ( -- Stands for a subtype.
    PersonId        INT      NOT NULL, -- To be CONSTRAINed as PRIMARY KEY and FOREIGN KEY at the same time, enforcing an association cardinality of one-to-zero-or-one from Party to Person.
    FirstName       CHAR(30) NOT NULL,
    LastName        CHAR(30) NOT NULL,
    GenderCode      CHAR(3)  NOT NULL,
    BirthDate       DATE     NOT NULL,
    Etcetera        CHAR(30) NOT NULL,  
    --
    CONSTRAINT Person_PK        PRIMARY KEY (PersonId),
    CONSTRAINT Person_AK        UNIQUE ( -- Composite ALTERNATE KEY.
        FirstName,
        LastName,
        GenderCode,
        BirthDate
    ),
    CONSTRAINT PersonToParty_FK FOREIGN KEY (PersonId)
        REFERENCES Party (PartyId)
);

CREATE TABLE Organization ( -- Represents the other subtype.
    OrganizationId  INT      NOT NULL, -- To be CONSTRAINed as PRIMARY KEY and FOREIGN KEY simultaneously, enforcing a association cardinality of one-to-zero-or-one from Party to Organization.
    Name            CHAR(30) NOT NULL,
    FoundingDate    DATE     NOT NULL,
    Etcetera        CHAR(30) NOT NULL,  
    --
    CONSTRAINT Organization_PK        PRIMARY KEY (OrganizationId),
    CONSTRAINT Organization_AK        UNIQUE      (Name), -- ALTERNATE KEY.
    CONSTRAINT OrganizationToParty_FK FOREIGN KEY (OrganizationId)
        REFERENCES Party (PartyId)
);

CREATE TABLE UserProfile (
    UserId          INT       NOT NULL, -- To be CONSTRAINed as PRIMARY KEY and FOREIGN KEY at the same time, enforcing an association cardinality of one-to-zero-or-one from Person to UserProfile.
    UserName        CHAR(30)  NOT NULL,
    CreatedDateTime TIMESTAMP NOT NULL,
    Etcetera        CHAR(30)  NOT NULL,  
    --
    CONSTRAINT UserProfile_PK         PRIMARY KEY (UserId),
    CONSTRAINT UserProfile_AK         UNIQUE      (Username),
    CONSTRAINT UserProfileToPerson_FK FOREIGN KEY (UserId)
        REFERENCES Person (PersonId)  
);

CREATE TABLE Account (
    AccountNumber   INT       NOT NULL,
    OwnerPartyId    INT       NOT NULL, -- A role name assigned to PartyId in order to depict the meaning it carries in the context of an Account.
    CreatedDateTime TIMESTAMP NOT NULL,
    Etcetera        CHAR(30)  NOT NULL,  
    --
    CONSTRAINT Account_PK        PRIMARY KEY (AccountNumber),
    CONSTRAINT AccountToParty_FK FOREIGN KEY (OwnerPartyId)
        REFERENCES Party (PartyId)
);

CREATE TABLE Transfer (
    TransferorAccountNumber INT       NOT NULL, -- Role name assigned to AccountNumber.
    TransfereeAccountNumber INT       NOT NULL, -- Role name assigned to AccountNumber
    TransferDateTime        TIMESTAMP NOT NULL,  
    Amount                  INT       NOT NULL, -- Retains the Amount in Cents, but there are other possibilities.
    Etcetera                CHAR(30)  NOT NULL, 
    --
    CONSTRAINT Transfer_PK             PRIMARY KEY (TransferorAccountNumber, TransfereeAccountNumber, TransferDateTime), -- Composite PRIMARY KEY.
    CONSTRAINT TransferToTransferor_FK FOREIGN KEY (TransferorAccountNumber)
        REFERENCES Account (AccountNumber),
    CONSTRAINT TransferToTransferee_FK FOREIGN KEY (TransfereeAccountNumber)
        REFERENCES Account (AccountNumber),
    CONSTRAINT AccountsAreDistinct_CK  CHECK       (TransferorAccountNumber <> TransfereeAccountNumber),
    CONSTRAINT AmountIsValid_CK        CHECK       (Amount > 0)
);

As noted, there is no need for the retention of ambiguous and problematic NULL marks in the columns of any of the base tables.

If you want to know if an Account involved in a certain Transfer is owned by an Organization or a Person, you can derive such information in a single SELECT statement via, e.g., the Transfer.TrasnferorAccountNumber, the Account.PartyId, and the Party.PartyTypeCode columns.

So that you can ensure that a Party can own at most one Account (as stated in comments), then you should fix a UNIQUE constraint for the Account.PartyId column. However, in real world scenarios, e.g., in a bank, a Person can own zero-one-or-many Accounts, thus I esteem that a one-to-zero-or-one association does not appear realistic.

As previously mentioned, the approach proposed in this answer is supposed to be utilized as a reference that you can extend and adapt on your own. Naturally, extensions and adaptations made at the conceptual level should be reflected in the logical model.

I tested the declaration of this structure in (i) this db<>fiddle and in (ii) this SQL Fiddle, both running on PostgreSQL 9.6 (you initially attached the tag of this database management system).

Integrity and consistency considerations regarding the tables Party, Person and Organization

With the layout described above, one must guarantee that each “supertype” row is at all times complemented by its corresponding “subtype” counterpart and, in turn, ensure that said “subtype” row is compatible with the value contained in the supertype “discriminator” column.

It would be very handy and elegant to enforce such circumstances declaratively but, unfortunately, none of the major SQL platforms has provided the proper mechanisms to do so (as far as I know). Therefore, it is quite convenient to make use of ACID TRANSACTIONS so that these conditions are always confidently met in a database.

Similar scenarios

In case you are interested in other business domains in which supertype-subtype structures come about, you might like to see my answers to

Relevant resource

These Stack Overflow posts cover very relevant points concerning the data type of a column that holds a currency datum, like Transfer.Amount, in PostgreSQL.

Endnote

_{^† Integration Definition for Information Modeling (IDEF1X) is a highly recommendable data modeling technique that was established as a standard in december 1993 by the United States National Institute of Standards and Technology (NIST). It is solidly based on (a) some of the early theoretical works authored by the originator of the relational model, i.e., Dr. E. F. Codd; on (b) the Entity-Relationship view, developed by Dr. P. P. Chen; and also on (c) the Logical Database Design Technique, created by Robert G. Brown.}

pgAdmin timing

When you execute a query from the query tool, the message pane shows something like:

Total query runtime: 62 ms.

And the status line shows the same time. I quote pgAdmin help about that:

The status line will show how long the last query took to complete. If a dataset was returned, not only the elapsed time for server execution is displayed, but also the time to retrieve the data from the server to the Data Output page.

If you want to see the time on the server you need to use SQL EXPLAIN ANALYZE or the built in Shift + F7keyboard shortcut or Query -> Explain analyze. Then, at the bottom of the explain output you get something like this:

Total runtime: 0.269 ms

Best Answer

Business rules

Expository IDEF1X diagram

Illustrative logical SQL-DDL declarations

Similar scenarios

Relevant resource

Endnote

Related Solutions

How to design relationships for variant data

Postgresql – Postgres multiple joins slow query, how to store default child record

pgAdmin timing

Related Question