Postgresql – Define Field as a Foreign Key

database-designpostgresqlschema

I am receiving data from an external source in batches that I need to import to my PostgreSQL database. I have 2 main tables: clients and providers. Ideally, they should have a 1:1 relationship, however, this isn't always going to be true with the data I receive. There are cases where a client will be sent in a batch and it's provider counterpart will only be sent in a later batch (or vice versa).

My question is what is the best way to work around this issue? Currently, I have the two tables that are unlinked but have the potential to be if the "keys" are present in both tables but something tells me this isn't the best way to go about things.

Looking forward to hearing your thoughts!

Many thanks

Best Answer

There are two kinds of foreign keys: declared foreign keys and undeclared foreign keys. Most people are familiar with declared foreign keys, but some don't realize that there is such a thing as an undeclared foreign key.

There are several advantages to declaring a foreign key. The big advantage is rejecting invalid data at input time. For interactive data entry, this is very valuable.

In your case, however, it's a detriment. You probably want to store all the data you are given, even if some of it appears to be dirty data. Later, you can go through some kind of cleansing operation where you detect certain kinds of errors, like orphaned references, and flag them for human intervention.

You would use this field in join conditions precisely the way you use foreign keys. This is what is meant by an undeclared foreign key. If there are any orphaned references, they will drop out of the join, and you need to program accordingly.

In a more formal situation, like a data warehouse, incoming data is loaded into a staging table at first, and only added to the warehouse data after cleansing. That's probably overkill in your case.

Related Solutions

Postgresql – Replication of specific tables using Postgres 9.4.4 and Amazon Web Services

According to the AWS Documentation, logical replication is possible from RDS Postgres instances, starting with version 9.4.9. That would solve the RDS -> local direction.

The documentation doesn't make it very clear, but it seems that from version 10.4 an RDS instance can even be the receiver end (the subscriber) of logical replication:

Beginning with PostgreSQL version 10.4, RDS supports the publication and subscription SQL Syntax for PostgreSQL 10 Logical Replication.

Defining Two Distinct Relationships Between User Table and Company Table in MySQL

To guarantee that a company, which can have many users, can ever have at most one admin, and further guarantee that the user defined as the admin is also a user defined for the same company, create a separate admin table. However, the foreign key definition has to be created in a specific way.

create table Companies(
  ID      int not null auto_increment,
  Name    varchar not null,
  Address varchar,
  primary key( ID )
);

create table Users(
  ID      int not null auto_increment,
  Name    varchar not null,
  primary key( ID )
);

create table CompanyUsers(
  UserID    int not null references Users( ID ),
  CompanyID int not null references Companies( ID ),
  JoinDate  date not null,
  LeaveDate date,
  primary key( UserID ), -- makes this 1-n
  unique( UserID, CompanyID )
);

create table Admins(
    UserID   int not null,
    CompanyID int not null,
    ..., -- other admin-related data
    primary key CompanyID,
    foreign key( UserID, CompanyID ) references CompanyUsers( UserID, CompanyID )
);

At first, the Admins table appears to define a 1-n relationship the same way as the CompanyUsers table. But if you follow the FK relationship through, you will see that it can only support 1-1 relationships.

Technically, you don't need the CompanyUsers table as you can define the company and dates in the Users table. This maintains the 1-n relationship but then it is no longer in 3nf. If you are going to have attributes (JoinDate, LeaveDate) that are dependent on more than the PK (that is, they describe the relationship rather than the user), they should go in a new table.

Best Answer

Related Solutions

Postgresql – Replication of specific tables using Postgres 9.4.4 and Amazon Web Services

Defining Two Distinct Relationships Between User Table and Company Table in MySQL

Related Question