Postgresql – Enforcing constraints “two tables away”

database-designforeign keypolymorphic-associationspostgresqlreferential-integrity

I ran into some trouble modeling an electrical schematic in SQL.
The structure I'd like to capture is

  part ←────────── pin
   ↑                ↑
part_inst ←───── pin_inst

where "inst" is short for "instance".

For example, I might have as a part an LM358 op-amp with pins
1OUT, 1IN-, 1IN+, GND, 2IN+, 2IN-, 2OUT, and VCC. I might
then place this part on a schematic, creating a part_inst and 8
pin_insts.

Ignoring data fields, my initial attempt at a schema was

create table parts (
    part_id bigserial primary key
);
create table pins (
    pin_id bigserial primary key,
    part_id bigint not null references parts
);
create table part_insts (
    part_inst_id bigserial primary key,
    part_id bigint not null references parts
);
create table pin_insts (
    pin_inst_id bigserial primary key,
    part_inst_id bigint not null references part_insts,
    pin_id bigint not null references pins
);

The main problem with this schema is that a pin_inst might be tied
to a part_inst with part_id=1 but its pin has part_id=2.

I'd like to avoid this problem on the database level rather than the
application level. So, I modified my primary keys to enforce that.
I marked the changed lines with --.

create table parts (
    part_id bigserial primary key
);
create table pins (
    pin_id bigserial,                                          --
    part_id bigint not null references parts,
    primary key (pin_id, part_id)                              --
);
create table part_insts (
    part_inst_id bigserial,                                    --
    part_id bigint not null references parts,
    primary key (part_inst_id, part_id)                        --
);
create table pin_insts (
    pin_inst_id bigserial primary key,
    part_inst_id bigint not null,                              --
    pin_id bigint not null,                                    --
    part_id bigint not null references parts,                  --
    foreign key (part_inst_id, part_id) references part_insts, --
    foreign key (pin_id, part_id) references pins              --
);

My gripe with this method is that it pollutes the primary keys:
Everywhere I refer to a part_inst, I need to keep track of both the
part_inst_id and the part_id. Is there another way I can go about enforcing the constraint
pin_inst.part_inst.part_id = pin_inst.pin.part_id without being overly verbose?

Best Answer

Minimal solution

One radical solution might be to remove pin_inst completely:

  part ←────────── pin
    ↑               
part_inst ←───── pin_inst

There is nothing in your question to suggest you actually need the redundant table. For pins associated to a part_inst, look at the pins of the associated part.

That would simplify the code to:

create table part (    -- using singular terms for table names
    part_id bigserial primary key
);
create table pin (
    pin_id bigserial primary key,
    part_id bigint not null references part
);
create table part_inst (
    part_inst_id bigserial primary key,
    part_id bigint not null references part
);

But your comment made clear that we won't get away with that ...

Alternative if pin_inst is needed

Including part_id like you did is the simplest solution with foreign key constraints. You cannot reference a table “two tables away” with foreign key constraints.

But you can at least make do without "polluting" the primary keys. Add UNIQUE constraints.

create table part (
    part_id bigserial primary key
);
create table pin (
    pin_id bigserial primary key,
    part_id bigint not null references part,
    unique(part_id, pin_id)         -- note sequence of columns
);
create table part_inst (
    part_inst_id bigserial primary key,
    part_id bigint not null references part,
    unique(part_id, part_inst_id)
);
create table pin_inst (
    pin_inst_id bigserial primary key,
    part_inst_id bigint not null,
    pin_id bigint not null,
    part_id bigint not,
    foreign key (part_id, pin_id) references pin,
    foreign key (part_id, part_inst_id) references part_inst
);

I put part_id first in the unique constraints. That is irrelevant for the referential integrity, but it matters for performance. The primary keys already implement indexes for the pk columns. It's better to have the other column first in the multicolumn indexes implementing the unique constraints. Details under these related questions:

Related questions on SO:

Alternative with triggers

You could resort to triggers functions, which are more flexible, but a bit more complicated and error prone and a bit less strict. The benefit: you could do without part_inst.part_id and pin.part_id ...