Sql-server – the benefit of two FK, when one of them can be deduced from another table

foreign key, sql server

The third party Microsoft SQL Server database I'm working on is using the structure which can be illustrated with this example:

Let's take three tables:

Shop which is the topmost level and corresponds, for example, to an e-commerce website, given that several websites are using the same database,
Category which is a logical category of products within a Shop,
Product which belongs to a Category (given that the category is mandatory, so there should be no products which don't belong to any category).

Category has one foreign key to Shop. The Product has two foreign keys: one to Category, another to Shop.

If I were designing a similar database, I would have put only one foreign key to Product, linking it only to a Category. IMO, it:

Simplifies the schema,
Avoids the risk of inconsistent state, where product 1 belongs to category 1 and site 2, but category 1 itself belongs to site 1,
Avoids wasting database space on redundant information,
Doesn't make it particularly more difficult to query data. Even if there are cases (for example, a search) where the website needs products without caring much about the categories, but still needs to know which shop they belong to, making two joins instead of one is not a big deal.
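The inconsistent state described above can be reproduced in a few lines. This is only a sketch: SQLite (through Python's sqlite3 module) stands in for SQL Server, and the table and column names are hypothetical, not taken from the third-party database.

```python
import sqlite3

# SQLite stands in for SQL Server; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE shop (shop_id INTEGER PRIMARY KEY);
    CREATE TABLE category (
        category_id INTEGER PRIMARY KEY,
        shop_id     INTEGER NOT NULL REFERENCES shop (shop_id)
    );
    -- Two independent foreign keys, as in the schema described above.
    CREATE TABLE product (
        product_id  INTEGER PRIMARY KEY,
        category_id INTEGER NOT NULL REFERENCES category (category_id),
        shop_id     INTEGER NOT NULL REFERENCES shop (shop_id)
    );
    INSERT INTO shop VALUES (1), (2);
    INSERT INTO category VALUES (1, 1);    -- category 1 belongs to shop 1
    INSERT INTO product VALUES (1, 1, 2);  -- product 1: category 1, but shop 2
""")

# Both foreign keys are satisfied, yet the row contradicts its category.
mismatched = conn.execute("""
    SELECT p.product_id, p.shop_id, c.shop_id
    FROM product AS p
    JOIN category AS c ON c.category_id = p.category_id
    WHERE p.shop_id <> c.shop_id
""").fetchall()
print(mismatched)  # [(1, 2, 1)]
```

Neither foreign key objects to the insert, because each one is checked on its own.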

Why was this third party database done in the way it's done? Are there benefits from this approach?

Best Answer

Category has one foreign key to Shop. The Product has two foreign keys: one to Category, another to Shop.

There should be a single foreign key referencing two columns in categories.

create table products (
  ...
  primary key (shop_id, category_id, product_id),
  foreign key (shop_id, category_id) 
    references categories (shop_id, category_id)
);
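As a rough illustration of what the composite key buys you, the sketch below fills in the elided columns with assumed integer IDs and runs the schema in SQLite (via Python's sqlite3 module) rather than SQL Server. The shops table and every column type are assumptions.

```python
import sqlite3

# SQLite stands in for SQL Server; column types and the shops table are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE shops (shop_id INTEGER PRIMARY KEY);
    CREATE TABLE categories (
        shop_id     INTEGER NOT NULL REFERENCES shops (shop_id),
        category_id INTEGER NOT NULL,
        PRIMARY KEY (shop_id, category_id)
    );
    CREATE TABLE products (
        shop_id     INTEGER NOT NULL,
        category_id INTEGER NOT NULL,
        product_id  INTEGER NOT NULL,
        PRIMARY KEY (shop_id, category_id, product_id),
        FOREIGN KEY (shop_id, category_id)
            REFERENCES categories (shop_id, category_id)
    );
    INSERT INTO shops VALUES (1), (2);
    INSERT INTO categories VALUES (1, 1);   -- category 1 exists only in shop 1
    INSERT INTO products VALUES (1, 1, 1);  -- consistent row: accepted
""")

try:
    # Shop 2 has no category 1, so the pair (2, 1) has no parent row.
    conn.execute("INSERT INTO products VALUES (2, 1, 2)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

product_count = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
print(rejected, product_count)  # True 1
```

Because the pair (shop_id, category_id) is checked as a unit, a product cannot name a shop that differs from its category's shop.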

Think about it this way. Conceptually, normalization starts with a single relation that contains all the attributes.

shop_name  category_name    product_name
--
Wibble     Chain saws       Stihl 350
Wibble     Chain saws       Poulan 3X
Wibble     Pole saws        Black & Decker 14 foot electric
Wibble     Pole saws        Corona 15 foot

Thursby    Pole saws        Hitachi electric
Thursby    Pole saws        Remington electric
Thursby    Chain saws       Husqvarna 460
Thursby    Chain saws       Poulan 3x

It should be clear that the only candidate key is {shop_name, category_name, product_name}. It's in 5NF. There's no redundant data.

Replacing text with ID numbers won't improve that.

A design that requires you to chase ID numbers through a hierarchy of tables is an anti-pattern. It's like building an IMS database in SQL. (IMS was a problem the relational model intended to solve.)
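To make "chasing ID numbers" concrete, here is a small comparison, again using SQLite through Python as a stand-in. The two layouts and their table names (products_a, shops_b, and so on) are hypothetical; the product names come from the example relation above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Layout A: keys propagate downward, so a product row names its shop.
    CREATE TABLE products_a (
        shop_name TEXT, category_name TEXT, product_name TEXT,
        PRIMARY KEY (shop_name, category_name, product_name)
    );
    INSERT INTO products_a VALUES
        ('Wibble',  'Chain saws', 'Stihl 350'),
        ('Thursby', 'Chain saws', 'Husqvarna 460');

    -- Layout B: surrogate IDs in a hierarchy; a product knows only its category.
    CREATE TABLE shops_b (shop_id INTEGER PRIMARY KEY, shop_name TEXT);
    CREATE TABLE categories_b (
        category_id INTEGER PRIMARY KEY, shop_id INTEGER, category_name TEXT
    );
    CREATE TABLE products_b (
        product_id INTEGER PRIMARY KEY, category_id INTEGER, product_name TEXT
    );
    INSERT INTO shops_b VALUES (1, 'Wibble'), (2, 'Thursby');
    INSERT INTO categories_b VALUES (10, 1, 'Chain saws'), (20, 2, 'Chain saws');
    INSERT INTO products_b VALUES (100, 10, 'Stihl 350'), (200, 20, 'Husqvarna 460');
""")

# Layout A answers "which products does Wibble sell?" from one table.
direct = conn.execute(
    "SELECT product_name FROM products_a WHERE shop_name = 'Wibble'"
).fetchall()

# Layout B has to walk product -> category -> shop to answer the same question.
chased = conn.execute("""
    SELECT p.product_name
    FROM products_b AS p
    JOIN categories_b AS c ON c.category_id = p.category_id
    JOIN shops_b AS s ON s.shop_id = c.shop_id
    WHERE s.shop_name = 'Wibble'
""").fetchall()
print(direct, chased)
```

Both queries return the same row; the difference is how many tables must be visited to get it.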

Related Solutions

Sql-server – SQL to select random mix of rows fairly

A clustered index seek or scan could be improved to a non-clustered index seek or scan, which should be more efficient.

Since it looks like your problem is Products, I would look at adding a covering index on that table (or perhaps an indexed view), since you already have:

Id ManufacturerId Active MemberPrice

Because some of your other columns don't have prefixes, I can't tell where they come from, but I expect some of them also come from Products, so it might not be feasible to make this index covering.

However, having Active and MemberPrice in the non-clustered index might help. It might be enough to tip the plan in favor of a non-clustered index (NCI) with a lookup to the clustered index to get the remaining columns (like FamilyImageName).
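The difference between a covering and a non-covering index can be seen in a query plan. The sketch below uses SQLite through Python, not SQL Server, so the plan text and index syntax differ (SQL Server would typically put the extra columns in an INCLUDE clause). The index name, column types, and key order are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (
        Id              INTEGER PRIMARY KEY,
        ManufacturerId  INTEGER,
        Active          INTEGER,
        MemberPrice     REAL,
        FamilyImageName TEXT
    );
    -- Hypothetical index; Id is available from it because it is the row key.
    CREATE INDEX IX_Products_Active
        ON Products (Active, MemberPrice, ManufacturerId);
""")

def plan(sql):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# Every referenced column lives in the index, so no table lookup is needed.
covered = plan(
    "SELECT Id, ManufacturerId, MemberPrice FROM Products WHERE Active = 1"
)
# FamilyImageName is not in the index, so the base table must be read as well.
not_covered = plan(
    "SELECT Id, FamilyImageName FROM Products WHERE Active = 1"
)
print(covered)
print(not_covered)
```

The first plan reports a covering index; the second does not, which corresponds to the "index plus lookup" case described above.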

Mysql – Tablestructure for fast inserts/deletes with foreign keys

First, if your tables are InnoDB, and column c.P_ID references column p.P_ID (note that the actual column names may be different), then you absolutely should use foreign keys to avoid orphans. Be aware that you have to state ON DELETE CASCADE explicitly: the default action (NO ACTION, which InnoDB treats like RESTRICT) rejects the delete of a parent row while child rows still reference it, rather than removing those child rows.
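A small sketch of the difference the ON DELETE clause makes. It runs on SQLite through Python rather than MySQL/InnoDB, so treat it as an illustration of the idea only. The tables p and c follow the aliases used above; the id column and the sample rows are made up.

```python
import sqlite3

def delete_parent(on_delete_clause):
    """Create p and c, delete the parent row, and report what happened."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(f"""
        CREATE TABLE p (P_ID INTEGER PRIMARY KEY);
        CREATE TABLE c (
            id   INTEGER PRIMARY KEY,
            P_ID INTEGER REFERENCES p (P_ID) {on_delete_clause}
        );
        INSERT INTO p VALUES (1);
        INSERT INTO c VALUES (10, 1), (11, 1);
    """)
    try:
        conn.execute("DELETE FROM p WHERE P_ID = 1")
        outcome = "deleted"
    except sqlite3.IntegrityError:
        outcome = "rejected"
    remaining = conn.execute("SELECT COUNT(*) FROM c").fetchone()[0]
    return outcome, remaining

print(delete_parent(""))                   # ('rejected', 2)
print(delete_parent("ON DELETE CASCADE"))  # ('deleted', 0)
```

Without a clause, the delete fails and both child rows survive; with ON DELETE CASCADE, the child rows go away together with the parent.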

But this doesn't really have anything to do with the 'single query' to select the products and information about the competitors.

To achieve:

I want to get, with only one query, the product information from p and also the competitor information from c where the product ID is the same.

your query would use a join:

SELECT products.P_ID, products.product_name, competitors.info
 FROM products
 LEFT JOIN competitors ON competitors.P_ID=products.P_ID

This repeats the product information on every row (only the competitor info differs). Your application then parses those rows into the array you need.

If you want a single row for the products, you can use the GROUP_CONCAT function for the competitors:

SELECT products.P_ID, products.product_name, 
  GROUP_CONCAT(competitors.info SEPARATOR ',') as competitor_info
 FROM products
 LEFT JOIN competitors ON competitors.P_ID=products.P_ID
 GROUP BY products.P_ID;
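For a quick feel of the result shape, the sketch below runs an equivalent query on SQLite through Python. SQLite spells the separator as a second argument, GROUP_CONCAT(x, ','), instead of MySQL's SEPARATOR keyword. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (P_ID INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE competitors (
        P_ID INTEGER REFERENCES products (P_ID),
        info TEXT
    );
    -- Made-up sample data: product 2 has no competitor rows.
    INSERT INTO products VALUES (1, 'Saw'), (2, 'Drill');
    INSERT INTO competitors VALUES (1, 'A: 9.99'), (1, 'B: 10.49');
""")

rows = conn.execute("""
    SELECT products.P_ID, products.product_name,
           GROUP_CONCAT(competitors.info, ',') AS competitor_info
    FROM products
    LEFT JOIN competitors ON competitors.P_ID = products.P_ID
    GROUP BY products.P_ID
    ORDER BY products.P_ID
""").fetchall()

for p_id, name, info in rows:
    # LEFT JOIN keeps products without competitors; their info is NULL (None).
    print(p_id, name, info.split(",") if info else [])
```

One row comes back per product, and a product with no competitors still appears, with an empty list after splitting. Splitting on the separator is only safe when the info values cannot contain it.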

A major caveat you should be aware of: ON UPDATE|DELETE CASCADE does not fire triggers on the rows that were deleted/updated by the cascade:

Note

Currently, cascaded foreign key actions do not activate triggers.

This shouldn't stop you from using CASCADE if you want to remove orphan rows. But you should be aware of it.
