Mysql – Linking foreign keys across multiple databases: direct, or using an intermediary table

database-design, foreign key, mariadb, multiple-database, MySQL

I want to make a part of my application reusable, and that warrants moving the corresponding tables into a separate database. So for the sake of an example, please consider the two imaginary databases in the list that follows. (More databases sharing the same logic may be added as the project grows.)

- users, containing tables related to user sign-ups, login and e-mail history, password reset tokens etc., as well as the accounts themselves; and
- blogs, having tables for posts, media files, comments, etc.

Each table in the blogs database obviously needs an account_id column that references users.accounts.id as a foreign key. (I do realise that for this to work both databases must use InnoDB and reside on the same server.)
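For illustration, a direct cross-database reference could be declared roughly as follows. This is only a minimal sketch: the email and title columns, the column types, and the constraint name fk_posts_account are assumptions made up for the example, not part of the actual schema.

```sql
-- Hypothetical minimal schemas; everything except id and account_id is assumed.
CREATE DATABASE IF NOT EXISTS users;
CREATE DATABASE IF NOT EXISTS blogs;

CREATE TABLE users.accounts (
  id    INT UNSIGNED NOT NULL AUTO_INCREMENT,
  email VARCHAR(255) NOT NULL,
  PRIMARY KEY (id)
) ENGINE = InnoDB;

CREATE TABLE blogs.posts (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
  account_id INT UNSIGNED NOT NULL,   -- must match the type of users.accounts.id
  title      VARCHAR(255) NOT NULL,
  PRIMARY KEY (id),
  CONSTRAINT fk_posts_account
    FOREIGN KEY (account_id)
    REFERENCES users.accounts (id)    -- database-qualified table name
    ON DELETE CASCADE
) ENGINE = InnoDB;
```

The only difference from an ordinary foreign key is the database-qualified name in the REFERENCES clause.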

My question is what would be a better practice:

direct reference to another database:
- simply point blogs.posts.account_id at users.accounts.id (and repeat for all other blogs.* tables),
- make each reference ON DELETE CASCADE; or
using an intermediary table:
- create an intermediary table blogs.accounts having only one column called id; then
- on one hand, point every table inside the blogs database at that intermediary table (so blogs.posts.account_id references blogs.accounts.id, ON DELETE CASCADE); and
- on the other hand, finish by pointing blogs.accounts.id at the 'upstream' users.accounts.id, again with ON DELETE CASCADE.
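As a rough sketch of the intermediary-table option, assuming users.accounts already exists on the same server with an INT UNSIGNED primary key id (the title column and both constraint names are hypothetical):

```sql
-- Connector table: one column, linked to the upstream accounts table.
CREATE TABLE blogs.accounts (
  id INT UNSIGNED NOT NULL,
  PRIMARY KEY (id),
  CONSTRAINT fk_blog_accounts_upstream
    FOREIGN KEY (id)
    REFERENCES users.accounts (id)
    ON DELETE CASCADE
) ENGINE = InnoDB;

-- Every table in blogs references the local connector, not users.accounts.
CREATE TABLE blogs.posts (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
  account_id INT UNSIGNED NOT NULL,
  title      VARCHAR(255) NOT NULL,
  PRIMARY KEY (id),
  CONSTRAINT fk_posts_local_account
    FOREIGN KEY (account_id)
    REFERENCES blogs.accounts (id)
    ON DELETE CASCADE
) ENGINE = InnoDB;
```

With this layout a row has to exist in blogs.accounts before any post can reference that account, so the application (or a trigger) would need to insert it first.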

The latter seems like an unnecessary complication. The only advantage I can think of is that it could make the setup future-proof in case we still end up having to migrate one (or some) of the databases to another server:

If we link the tables directly, after the migration the blogs database will have lots of disparate account_id columns that no longer cascade on delete.
But if these intermediary tables get disconnected from the upstream users.accounts.id, their neighbouring tables in each respective database are still linked to them. This way we can continue benefitting from at least some integrity and cascading deletes. In other words, if a user gets deleted, all we have to do is have a script go through each of these *.accounts connector tables and delete the matching id once, and ON DELETE CASCADE will take care of the rest of the tables inside that database automatically.
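To make that concrete, the post-migration handling might look something like this, assuming the upstream constraint on blogs.accounts was named fk_blog_accounts_upstream (a hypothetical name) and that 42 stands in for the id of the deleted user:

```sql
-- Run once when blogs moves to another server: detach from users.accounts.
ALTER TABLE blogs.accounts
  DROP FOREIGN KEY fk_blog_accounts_upstream;

-- Afterwards, whenever an account is deleted upstream, the cleanup script
-- issues one statement per connector table; the remaining blogs.* rows are
-- removed by their own ON DELETE CASCADE constraints.
DELETE FROM blogs.accounts WHERE id = 42;
```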

Am I on the right track with this logic, or am I missing some other ways to handle this more effectively, and therefore reinventing the wheel?

Best Answer

This is bordering on an opinion-based question, because of course anyone can have their own justification for either choice.

I would not choose to add the intermediary table, for a few reasons.

One is that, as you know, the intermediary table cannot have a foreign key reference to a table on another MySQL instance. Foreign keys can span schemas, but not instances.

If you need to run special scripts to clean up the intermediary table anyway, that script should be able to encode the cleanup tasks for all the blog tables. I assume the number of blog tables remains relatively stable, at least. You aren't going to be adding so many blog tables in custom ways that maintaining the cleanup script becomes a burden.
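Such a cleanup script could be as simple as the sketch below. The table names blogs.comments, blogs.media and blogs.posts are guesses based on the description in the question, and 42 is a placeholder for the deleted account's id.

```sql
-- Assumes each listed table has an account_id column and uses InnoDB,
-- so the three deletes commit or roll back together.
SET @deleted_account_id = 42;

START TRANSACTION;
DELETE FROM blogs.comments WHERE account_id = @deleted_account_id;
DELETE FROM blogs.media    WHERE account_id = @deleted_account_id;
DELETE FROM blogs.posts    WHERE account_id = @deleted_account_id;
COMMIT;
```

Adding a new blog table then means adding one more DELETE line, which is about the same effort as adding one more foreign key to an intermediary table.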

The intermediary table therefore seems like an extra step that just complicates things, and takes extra storage.

I've worked at several companies that eschewed the use of foreign key constraints in any case, because they make some operational tasks harder (schema changes, for example) and because they incur locking behavior that you probably don't expect. In the absence of the constraints, you can't get the ON DELETE CASCADE feature.

Like I said above, this is subject to opinion. You could easily ask, "but what if the priorities are a bit different...?" Of course, you must make this decision based on your requirements and priorities.

Related Solutions

Mysql – Linking identifiers across tables in MySQL (Foreign Keys)

Is there a way to tell MySQL that the ContactPhone.ContactID field is referencing the Primary Key of Contact?

Yes, use a foreign key.

Can there be automatic handling? If the Contact is deleted, then relevant ContactPhone.ContactID should be set to zero.

Yes, specify an action after ON DELETE in the code below.

Here's an example:

create table ContactPhone(
  ContactID int not null,
  PhoneNumber text not null,
  CONSTRAINT `fk_contactid`      -- this is the foreign key
    FOREIGN KEY (`ContactID`)
    REFERENCES `Contact`(`id`)
    ON DELETE CASCADE            -- the on delete action goes here (CASCADE shown as one valid choice)
) engine = InnoDB;               -- make sure to use InnoDB if you want FKs to be enforced

Make sure that either InnoDB is specified as the storage engine, or that your version of MySQL defaults to InnoDB. Other storage engines will parse and silently ignore foreign key constraints.
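As far as I know there is no referential action that sets the column to zero; the closest built-in behaviour is ON DELETE SET NULL, which requires the referencing column to be nullable. (SET DEFAULT is parsed, but InnoDB rejects table definitions that use it.) A minimal sketch, assuming a Contact table with an integer primary key id; the Name column and the constraint name are made up for the example:

```sql
CREATE TABLE Contact (
  id   INT NOT NULL AUTO_INCREMENT,
  Name VARCHAR(100) NOT NULL,          -- hypothetical column
  PRIMARY KEY (id)
) ENGINE = InnoDB;

CREATE TABLE ContactPhone (
  ContactID   INT NULL,                -- must be nullable for SET NULL
  PhoneNumber VARCHAR(32) NOT NULL,
  CONSTRAINT fk_contactphone_contact
    FOREIGN KEY (ContactID)
    REFERENCES Contact (id)
    ON DELETE SET NULL                 -- orphaned phone rows keep existing
) ENGINE = InnoDB;
```

Deleting a row from Contact then leaves its phone numbers in place with ContactID set to NULL.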

Here's the grammar for a MySQL foreign key:

[CONSTRAINT [symbol]] FOREIGN KEY
    [index_name] (index_col_name, ...)
    REFERENCES tbl_name (index_col_name,...)
    [ON DELETE reference_option]
    [ON UPDATE reference_option]

reference_option:
    RESTRICT | CASCADE | SET NULL | NO ACTION
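The same clause can also be attached to an existing table. The following sketch assumes ContactPhone and Contact already exist without a constraint between them; the constraint name fk_contactphone_cascade is hypothetical.

```sql
ALTER TABLE ContactPhone
  ADD CONSTRAINT fk_contactphone_cascade
    FOREIGN KEY (ContactID)
    REFERENCES Contact (id)
    ON DELETE CASCADE    -- remove phone rows when the contact is deleted
    ON UPDATE CASCADE;   -- follow changes to Contact.id
```

If either ON DELETE or ON UPDATE is left out, InnoDB rejects any delete or update of a parent row that still has child rows.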

Postgresql – How to change schema so that account_id reference is unique among 3 tables

Start with:

Create a ChartOfAccounts table with the Account code as Primary Key.
Add a Foreign Key constraint to ChartOfAccounts on all tables with an AccountCode field.
Use an IsDebit field, not the numeric sign, to distinguish Debits from Credits, and reserve negative signs for transaction reversals (if used at all). This is necessary in order to generate T-Balances and Trial Balances properly from your Journal and Ledger.
Create a Journal table with Primary Key: TransactionType, PostingDate, Account, SubledgerCode, IsDebit and minimum attributes of: Amount, CreatedDate, CreatedBy, DocumentReference
Design and spec a stored procedure (or type of routine suitable for where your business logic is located) for each type of transaction to be handled by the system. For your system these might be:
- Ticket Purchase for Cash
- Ticket Purchase on Account
- Prize Payout in Cash
- etc.
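A rough PostgreSQL sketch of the first few steps is shown below. The column types, lengths, and the Description column are assumptions; only the table names, key columns, and minimum attributes come from the list above.

```sql
CREATE TABLE ChartOfAccounts (
  AccountCode varchar(20) PRIMARY KEY,
  Description text NOT NULL              -- hypothetical column
);

CREATE TABLE Journal (
  TransactionType   varchar(20)   NOT NULL,
  PostingDate       date          NOT NULL,
  Account           varchar(20)   NOT NULL
                    REFERENCES ChartOfAccounts (AccountCode),
  SubledgerCode     varchar(20)   NOT NULL,
  IsDebit           boolean       NOT NULL,  -- debit/credit flag, not the sign
  Amount            numeric(14,2) NOT NULL,
  CreatedDate       timestamptz   NOT NULL DEFAULT now(),
  CreatedBy         text          NOT NULL,
  DocumentReference text          NOT NULL,
  PRIMARY KEY (TransactionType, PostingDate, Account, SubledgerCode, IsDebit)
);
```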

Please note that I am a CGA, CPA in addition to being primarily a professional developer.

Update - Terminology:

A Journal is a chronological list of the details of all transactions of a given type, such as Cash Receipts, Cash Disbursements, Sales, etc.
A Ledger is a listing By Account of the aggregates of all transactions in a given time period.

It is occasionally necessary or expedient, when a wide variety of transaction types will be supported by the system (or to increase parallelism, as when many clerks need to be working at once Bob Cratchit style), to have multiple Journal files with different structure.

In a modern SQL Server system with only one Journal the Ledger could be defined as an Indexed View on the Journal. This would eliminate the need for either a trigger on the Journal to update the Ledger, or a batch-processing design.
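In PostgreSQL, which has no indexed views, a plain (or materialized) view could play a similar role. The sketch below assumes a Journal table with Account, PostingDate, IsDebit and Amount columns and aggregates by calendar month; the view name, the monthly period, and the output column names are all assumptions.

```sql
CREATE VIEW Ledger AS
SELECT
  Account,
  date_trunc('month', PostingDate)::date                AS PeriodStart,
  SUM(CASE WHEN IsDebit     THEN Amount ELSE 0 END)     AS TotalDebits,
  SUM(CASE WHEN NOT IsDebit THEN Amount ELSE 0 END)     AS TotalCredits
FROM Journal
GROUP BY Account, date_trunc('month', PostingDate)::date;
```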

Also, it is acceptable to have separate DrAmount and CrAmount columns in place of an IsDebit flag and single Amount column.
