Mysql – If foreign keys/cascade deletes are bad, why use a database-server with that feature

foreign keyMySQLnosqlreferential-integrity

I noticed wordpress/rails etc do not use foreign keys constraint or cascade deletes features from database. Instead they handle this in PHP/Ruby/scripting level!

I have read this and this. Most arguments against foreign keys constraint talks about performance, multithreading, locking, scalability, etc.

Assuming arguments against foreign keys are valid, my question are:

If foreign keys are bad, why WordPress/Rails/etc uses a sql-server which supports foreign keys? Will they benefit going away from MySQL to a NoSQL kind of server?
On other hand, can applications coded in a way to make use of foreign keys feature without running into issues?
Is noSQL/redis better if we are using database only for storage and managing "relations" at application/script layer?

Best Answer

Lets start with your first link. It says clearly:

Working on large databases with referential integrity just does not perform well.

And that is right. Just you likely have no clue that "large database" is terabyte size with billions of rows in a table. A simple select may cascade into hundreds of millions of related elements to be deleted, and then you have a performance problem.

This is a non-issue for regular small databases such as a wordpress log or most CMS - it turns into a problem if you do something like facebook, or handle financial simulation data. I regularly deal with multi billion row tables and deletes that work in stored procedures outside of transactions in batches of x - because the end delete may easily clean up some hundred million rows.

Will they benefit going away from MySQL to a NoSQL kind of server?

Hardly. They are useful when a professional uses them as appropriate.

On other hand, can applications coded in a way to make use of foreign keys feature without running into issues?

Yes.

Is noSQL/redis better if we are using database only for storage and managing "relations" at application/script layer?

I once did a application review for a technology upgrade in a bank that uses no referential integrity (to get the performance up). Loading the data into SQL Server (which was supposed to replace their aging Adabas installation) it failed with integrity constraint violations. Happens 40% of the historical records were invalid because some * had deleted lookup table values not in use any more (such as old customer classification codes, which got replaced for all active customers, just not the old ones). No referential integrity warning ever came up. The result was some fired people and a problem stuck in workaround and a partially useless data warehouse build on top.

So much for managing relations at application / script layer. Errors WILL happen. Data is valuable, applications change.

Most people complaining about SQL level features would be more advised to read a book about them, and try to understand them, instead of complaining. Sadly a lot of the advice on the internet is written by people that refuse to read even documentation. Always be careful with that. Most advice to NoSql is based strongly on ignorance.

Related Solutions

Mysql – Tablestructure for fast inserts/deletes with foreign keys

First, if your tables are InnoDB, and column c.P_ID is the reference of column p.P_ID (note actual column names may be different), then you absolutely should use foreign keys to avoid orphans, but know that you will have to explicitly state ON DELETE CASCADE, since the default is ON DELETE SET NULL, retaining orphans.

But this doesn't really have anything to do with the 'single query' to select the products and information about the competitors.

To achieve:

I want to achieve, that with only one query i got the product information from p an also the competitor ones from c where the product id is the same.

your query would use a join:

SELECT products.P_ID, products.product_name, competitors.info
 FROM products
 LEFT JOIN competitors ON competitors.P_ID=products.P_ID

This will get you repeat information for the products (competitor info will be different). And then you will use your application to parse that info into the array you need.

If you want a single row for the products, you can use the GROUP_CONCAT function for the competitors:

SELECT products.P_ID, products.product_name, 
  GROUP_CONCAT(competitors.info SEPARATOR ',') as competitor_info
 FROM products
 LEFT JOIN competitors ON competitors.P_ID=products.P_ID
 GROUP BY products.P_ID;

A major caveat you should be aware of: ON UPDATE|DELETE CASCADE does not fire triggers on the rows that were deleted/updated by the cascade:

Note

Currently, cascaded foreign key actions do not activate triggers.

This shouldn't stop you from using CASCADE if you want to remove orphan rows. But you should be aware of it.

MySQL – Why Use Foreign Keys if Cascade Action is Not Needed?

The basic reason in my opinion for using constraints - and not only foreign key ones - is to enforce integrity at the database level and do not rely on the possibly several applications that will use the database in its entire lifetime.

Defining that FOREIGN KEY enforces that every photo - to be inserted - has to be related to an existing user.
Defining NO ACTION enforces that a user cannot be deleted if any related photos are not deleted first.
Defining ON CASCADE DELETE enforces that the photos are deleted along with the user's deletion.
Defining a UNIQUE constraint on username does not allow two users with same name, etc...

In all of the above, the DBMS is entitled to maintain the integrity according to the constraints defined and not allow the database to fall into an inconsistent state.

Best Answer

Related Solutions

Mysql – Tablestructure for fast inserts/deletes with foreign keys

MySQL – Why Use Foreign Keys if Cascade Action is Not Needed?

Related Question