How to give a table a “primary selection” to a table it is many-to-many with

database-design

Description of Issue / Question

I want to be able to specify a default or primary relationship of a table to a row in the database that the first table already has a many-to-many relationship with.

Example / Usecase:

A user can be a member of multiple organizations. We do this by creating a many-to-many relationship via an association table between "user" and "organization". I want the user to be able to specify a "primary" organization

What I've tried

Schema:

organization:
  - id (integer, pkey)
  - name (varchar)

user
  - id (integer, pkey)
  - email (varchar)
  - primary_org_id (fkey to organization table) <-- "Primary Organization"

user_organization_assoc:
  - organization_id (integer, fkey to organization table)
  - user_id (integer, fkey to user table)
    * This also has a unique constraint on (organization_id, user_id)

My initial issue was DROP dependency issues – this was fixed by properly naming the foreign key constraints. It now "works", but several other theoretical issues arise with this solution.

The most obvious issue is the lack of integrity between the user_organization_assoc table and user.primary_org_id. When a user is removed from an organization, we have to remember to manually update the primary_org_id value of the user table.

Questions

  1. How would you do this?
  2. Am I going about this the wrong way?
  3. This is a concept that is seen a lot in SaaS and other applications, are there any resources on this?

Disclosure

This development work was initially being done in python + sqlalchemy, but now it is mainly a question of principles / theory.

Thanks!

Best Answer

The fact that some relation is "primary" is an attribute of the relation. So it must be stored in relation table. For example, with the name is_primary.

Variant 1.

The field is defined as BOOLEAN (or INT(1) if no BOOLEAN datatype exists).

To ensure that there is only one primary relation per user we may create unique index by (user_id, is_primary). In such case all another, non-primary, records must have NULL in is_primary field. And when we need to change the primary relation we must alter two records - set NULL to previous primary and set TRUE to current one. Of course, the field is defined as nullable with DEFAULT NULL.

Variant 2.

The field is defined as DATETIME (or TIMESTAMP).

To ensure that there is only one primary relation per user we may create unique index by (user_id, is_primary) again. But now we do not need to "clear" the field value in previous primary - we only set this field in new primary to current timestamp, and, when we need the current primary, we simply select the record with the greatest value in this field.

The field do not need to be defined as nullable.

When we insert new record which is not primary, we may use an expression similar to COALESCE(MIN(is_primary) OVER (PARTITION BY user_id) - INTERVAL 1 SECOND, '1980-01-01'). The number of seconds in 10 years practically guarantees that there won't be some problems with set this field value to the new record.

Moreover, some kind of "change history" will exist - for each record we can see does it was somewhere primary one, and if it was - when it was assigned to primary the last time.