Postgresql – ENUM and ID as a Foreign Key

composite-typesdatabase-designenumforeign keypostgresql

Using PostgreSQL 11.

I've tried digging through both Stack Overflow and here and was unable to find an answer on best practices.

I'm working on a database design and have arrived at a schema that uses a generic "join table". This join table contains five columns:

CREATE TABLE many_joins_table (
  id PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
  object_id int NOT NULL,
  object_table joins_object_t NOT NULL,
  parent_id int NOT NULL,
  parent_table joins_parent_t NOT NULL);

I've been using this table to represent adjacent and many-to-many relationships between objects in my database. One such example is tags.

CREATE TABLE tag (
  id PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
  name text NOT NULL UNIQUE);

CREATE TABLE comment (
  id PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY);


CREATE TYPE joins_object_t AS ENUM ('tag');
CREATE TYPE joins_parent_t AS ENUM ('comment');

When a tag is added to the comment table, I would insert a new row into this join table with the following fields:

INSERT INTO many_joins_table
VALUES(1, 'tag'::joins_object_t, 1, 'comment'::joins_parent_t);

Besides inflexibility of enums, addressed with PostgreSQL 9.1 https://stackoverflow.com/questions/1771543/adding-a-new-value-to-an-existing-enum-type/7834949#7834949.

Are there any significant disadvantages or advantages of such an approach? I'm worried that I've mistakenly implemented an anti-pattern. Are there any best practices I can apply to improve this implementation (indexing or constraints)?

Thanks!

Note: I'm aware there are better ways to implementing tags, namely using intarrays. I'm just using tags as an example since it's easily understood. https://stackoverflow.com/questions/23508551/integer-array-lookup-using-postgres

Edit: Removed the UUIDs since it may be a distraction to the question.

Best Answer

First, stay off enums for things like this. Enum values can never be removed, so use them only if you are sure that that will never be necessary, which doesn't seem the case here.

Anyway, I would say that your design is too complicated, and still lacks the crucial feature of referential integrity.

Use a junction table for each pair of objects that can be related. This way, you

  • make clear which objects can be related

  • can have referential integrity

Having many tables is something that a database is good at. If you contend that you have 1000 tables and each object can be related to each other, that would be too many tables. But in that case you should probably go for a model where you have not a table per object type anyway.