How to give a table a “primary selection” to a table it is many-to-many with

database-design

Description of Issue / Question

I want to be able to specify a default or primary relationship of a table to a row in the database that the first table already has a many-to-many relationship with.

Example / Usecase:

A user can be a member of multiple organizations. We do this by creating a many-to-many relationship via an association table between "user" and "organization". I want the user to be able to specify a "primary" organization

What I've tried

Schema:

organization:
  - id (integer, pkey)
  - name (varchar)

user
  - id (integer, pkey)
  - email (varchar)
  - primary_org_id (fkey to organization table) <-- "Primary Organization"

user_organization_assoc:
  - organization_id (integer, fkey to organization table)
  - user_id (integer, fkey to user table)
    * This also has a unique constraint on (organization_id, user_id)

My initial issue was DROP dependency issues – this was fixed by properly naming the foreign key constraints. It now "works", but several other theoretical issues arise with this solution.

The most obvious issue is the lack of integrity between the user_organization_assoc table and user.primary_org_id. When a user is removed from an organization, we have to remember to manually update the primary_org_id value of the user table.

Questions

How would you do this?
Am I going about this the wrong way?
This is a concept that is seen a lot in SaaS and other applications, are there any resources on this?

Disclosure

This development work was initially being done in python + sqlalchemy, but now it is mainly a question of principles / theory.

Thanks!

Best Answer

The fact that some relation is "primary" is an attribute of the relation. So it must be stored in relation table. For example, with the name is_primary.

Variant 1.

The field is defined as BOOLEAN (or INT(1) if no BOOLEAN datatype exists).

To ensure that there is only one primary relation per user we may create unique index by (user_id, is_primary). In such case all another, non-primary, records must have NULL in is_primary field. And when we need to change the primary relation we must alter two records - set NULL to previous primary and set TRUE to current one. Of course, the field is defined as nullable with DEFAULT NULL.

Variant 2.

The field is defined as DATETIME (or TIMESTAMP).

To ensure that there is only one primary relation per user we may create unique index by (user_id, is_primary) again. But now we do not need to "clear" the field value in previous primary - we only set this field in new primary to current timestamp, and, when we need the current primary, we simply select the record with the greatest value in this field.

The field do not need to be defined as nullable.

When we insert new record which is not primary, we may use an expression similar to COALESCE(MIN(is_primary) OVER (PARTITION BY user_id) - INTERVAL 1 SECOND, '1980-01-01'). The number of seconds in 10 years practically guarantees that there won't be some problems with set this field value to the new record.

Moreover, some kind of "change history" will exist - for each record we can see does it was somewhere primary one, and if it was - when it was assigned to primary the last time.

Related Solutions

MySQL data modeling help

I would use separate tables for the project-sponsor and event-sponsor relationships. You are making two many-to-many relationships here, and that just seems like the right way to do it to me.

Can each project/event/org have more than one leader? If not, why not simply put the leader id in the project/event/org table? Otherwise, I would use three tables for the same reason as above: you are describing three many-to-many relationships.

How to design relationships for variant data

This might force a bit of change for your tables, but we do something like this at work:

Ensure each table has a int/guid primary key of some sort:

Organization Table

ID | Name
1  | MyOrganization

Then create a type-table to house your different LeadSource types:

LeadSourceTypeKey | Description
1                 | Organization
2                 | Company Office

Then on your ProductSoldList Table add 2 columns LeadSourceID and LeadSourceType

Then you can easily query the ProductSoldList TABLE and know which LeadSource to join back to:

SELECT *
FROM   [ProductSoldList] PL
       INNER JOIN [Organization] O
       ON PL.LeadSourceID = O.ID AND
          PL.LeadSourceType = 1
       INNER JOIN [Company Office] CO
       ON PL.LeadSourceID = CO.ID AND
          PL.LeadSourceType = 2

Best Answer

Related Solutions

MySQL data modeling help

How to design relationships for variant data

Related Question