Mysql – Tablestructure for fast inserts/deletes with foreign keys

database-designforeign keyinnodbMySQLperformance

Current Situation
We have a table called c with ca. 300,000 rows. In this table we store the competitors for a specific product.
Example:

id | competitors | some infos about competitor        | product  
---------------------------------------------------------------
1  | C_1         | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | P_1  
2  | C_2         | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | P_1  
3  | C_3         | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | P_1  
4  | C_1         | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | P_2  
5  | C_4         | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | P_2

Every 30 minutes we read in the new competitors. To do so we first delete everyone with the specific product (P_1) and insert afterwards the new ones. So there will never be a competitor in our DB, which isnt at the moment listed for this product.

When we got the new infos, we do some kind of research with the products.
(We also have a table called p with ca. 100,000 rows. In that table we store the infos about the products.)

So back to the research process. We run through every product an have to get the competitors with their infos. For this purpose we have do do a mysql request like SELECT * FROM c WHERE product LIKE "P_1";. Not to say that this is not very performant.

My Goal
I want to achieve, that with only one query i got the product information from p an also the competitor ones from c where the product id is the same.
So that these infos get stored in an array like this:

[0]['id'] => '2'  
[0]['product_identifier'] => 'P_1'  
[0]['c'][0]['competitors'] => 'C_1'  
[0]['c'][1]['competitors'] => 'C_2'  
[0]['c'][3]['competitors'] => 'C_3'

Possible Solution
To get the needed information with just one query, i came up with this solution.

In table c the product row contains a foreign key which links to the p table. That would be very easy to realise. But I am asking myself if it would be possible to delete c or p rows easaly. I never worked that intensive with foreign keys – seemed to me always a bit error generating.

And I also wonder if I could make a join that easy, cause I want from p the related competitors and not the other way around. You know what I mean? Not sure if I can do that this way. And if I could, how to tread this foreign keys. What do I have looking for in order to not get messed up with an unclean database.

Example
We will use our table c from above, where c.product stores the foreign key to our table p.product_identifier. Like this:

id | product_identifier | some infos about the product          
---------------------------------------------------------------
1  | P_1                | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx  
2  | P_2                | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx  
3  | P_3                | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

I am not sure how this works out or if it's even a good idea, but how has my query look like? If i want to SELECT every product from p and also their competitors from c?

Best Answer

First, if your tables are InnoDB, and column c.P_ID is the reference of column p.P_ID (note actual column names may be different), then you absolutely should use foreign keys to avoid orphans, but know that you will have to explicitly state ON DELETE CASCADE, since the default is ON DELETE SET NULL, retaining orphans.

But this doesn't really have anything to do with the 'single query' to select the products and information about the competitors.

To achieve:

I want to achieve, that with only one query i got the product information from p an also the competitor ones from c where the product id is the same.

your query would use a join:

SELECT products.P_ID, products.product_name, competitors.info
 FROM products
 LEFT JOIN competitors ON competitors.P_ID=products.P_ID

This will get you repeat information for the products (competitor info will be different). And then you will use your application to parse that info into the array you need.

If you want a single row for the products, you can use the GROUP_CONCAT function for the competitors:

SELECT products.P_ID, products.product_name, 
  GROUP_CONCAT(competitors.info SEPARATOR ',') as competitor_info
 FROM products
 LEFT JOIN competitors ON competitors.P_ID=products.P_ID
 GROUP BY products.P_ID;

A major caveat you should be aware of: ON UPDATE|DELETE CASCADE does not fire triggers on the rows that were deleted/updated by the cascade:

Note

Currently, cascaded foreign key actions do not activate triggers.

This shouldn't stop you from using CASCADE if you want to remove orphan rows. But you should be aware of it.

Related Solutions

Sql-server – How to get the foreign and primary key details for a table in SQL Server 2008

You should become familiar with the new catalog views. The old sysobjects, syscolumns etc. have been deprecated and are only provided for backward compatibility reasons - they shouldn't be used for new development and old code should eventually transition to the new objects.

Here is a query that will get both sides of the columns involved in foreign keys:

SELECT 
        constraint_name   = OBJECT_NAME(fkc.constraint_object_id), 
        foreign_schema    = SCHEMA_NAME(child.[schema_id]), 
        foreign_table     = child.name, 
        foreign_column    = child_cols.name, 
        referenced_schema = SCHEMA_NAME(referenced.[schema_id]),
        referenced_table  = referenced.name, 
        referenced_column = referenced_cols.name
    FROM sys.foreign_key_columns AS fkc
    INNER JOIN sys.tables AS child
    ON fkc.parent_object_id = child.[object_id]
    INNER JOIN sys.tables AS referenced
    ON fkc.referenced_object_id = referenced.[object_id]
    INNER JOIN sys.columns AS referenced_cols
    ON fkc.parent_column_id = referenced_cols.column_id
    AND referenced_cols.[object_id] = referenced.[object_id]
    INNER JOIN sys.columns AS child_cols
    ON fkc.referenced_column_id = child_cols.column_id
    AND child_cols.[object_id] = child.[object_id]
    WHERE referenced.name = 'table name'
    -- if you want the tables that a child table references, use this instead:
    -- WHERE child.name = 'table name'
    ORDER BY fkc.parent_column_id;

And this one is much simpler, just for the tables involved:

SELECT 
    constraint_name = name, 
    foreign_schema = OBJECT_SCHEMA_NAME(parent_object_id),
    foreign_table = OBJECT_NAME(parent_object_id), 
    referenced_schema = OBJECT_SCHEMA_NAME(referenced_object_id),
    referenced_table = OBJECT_NAME(referenced_object_id)
FROM sys.foreign_keys
WHERE OBJECT_NAME(referenced_object_id) = 'table name'
-- if you want the tables that a child table references, use this instead:
-- WHERE OBJECT_NAME(parent_object_id) = 'table name';

EDIT adding some links to documentation (these mostly fall within the SQL Server 2008 R2 hierarchy but they're roughly equivalent for SQL Server 2008 as well):

Mysql – indexes, foreign keys and optimization

In the example you show, the expression would use the index on criteria to resolve which rows to retrieve, and the leaf nodes in the index contain references to the corresponding rows, either by physical location in the case of MyISAM or by primary key value in the case of InnoDB. That way it avoids a full table scan.

Note that if the query optimizer thinks your query matches too much of the table, it will ignore the index, because it's just as fast to do the table-scan. By analogy, this is why no one bothers to put the word "the" in the index at the back of a book.
No, MySQL requires that an index exists for the foreign key. If an index already exists, InnoDB will use it, but it will create one if necessary. I have found some corner cases where you can trick InnoDB by dropping the index after defining the foreign key constraint, but this would not be advisable, because the index is important for checking referential integrity quickly.
It's true that insert/update/delete needs to update indexes to reflect values added or removed. But it's not as bad as you might fear. Also, InnoDB implements a "change buffer" which records how your update affects secondary indexes, and reconciles it with the whole index later.
InnoDB and MyISAM support only BTREE indexes. InnoDB also has an internal feature called the adaptive hash which does do a hash lookup for frequently-requested index entries. But this is completely automatic and transparent; you don't have to define any index or tune any config parameters.

Best Answer

Related Solutions

Sql-server – How to get the foreign and primary key details for a table in SQL Server 2008

Mysql – indexes, foreign keys and optimization

Related Question