MySQL – Column value depends on values from two different tables

database-designMySQL

I have a MySQL database that contains these tables (* denotes primary key):

TableA:

----------------------------------------------
|  *id  |  time  |  price  |  special_price  |
----------------------------------------------

TableB:

-------------------------------------
|  *table_a_id  |  *date  |  price  |
-------------------------------------

SpecialDate:

--------------------------
|  *date  |  description |
--------------------------

The value of TableB.price column of each entry in TableB will be either TableA.price or TableA.special_price of the corresponding TableA entry, depending on whether TableB.date is a special date or not (whether TableB.date exists in the SpecialDate table).

This design, as it stands, doesn't feel right as every time I'm adding a record to TableB, I have to look up the date from SpecialDate to see if it's a special date or not, then look up the corresponding entry from TableA to get the value for the price column. And every time a record from SpecialDate gets added or updated, I have to go over every entry in TableB to update its price column.

I'm thinking the column TableB.price should be removed and calculated when needed but I'm not sure how to do that using SQL. I have a lot of views that depend on that price column, for example:

TableAFullView:

-----------------------------------------------------------------------------
|  id  |  time  |  count_of_table_b_instances  |  sum_of_table_b_instances  |
-----------------------------------------------------------------------------

Where count_of_table_b_instances is the number of TableB entries that point to this TableA entry and sum_of_table_b_instances is the sum of their prices.

With my original design, this view is simple to implement and it performs well. But if I get rid of the column TableB.price, how would I implement this view? Will the performance take a hit?

Best Answer

With my original design, this view is simple to implement and it performs well. But if I get rid of the column TableB.price, how would I implement this view

create table TableA (
id int,
v_time date,
price int,
special_price int
);

create table TableB (
table_a_id int,
v_date date
);

create table SpecialDate (
v_date date,
description varchar(5)
);

insert into TableA values (1,'2019-09-01', 10, 5);
insert into TableB values (1,'2019-10-01');
insert into TableB values (1,'2019-11-01');

insert into TableA values (2,'2019-12-01', 50, 1);
insert into TableB values (2,'2019-12-02');
insert into TableB values (2,'2019-12-03');
insert into TableB values (2,'2019-11-01');

insert into SpecialDate values('2019-11-01', 'sale');

--your view definition

select a.id,
   a.v_time,
   count(*) as count_of_table_b_instances,
   sum(
      case
     when s.v_date is not null then a.special_price
     else a.price
     end
   ) as sum_of_table_b
 from TableA a
 join TableB b
   on a.id = b.table_a_id
 left join SpecialDate s
  on b.v_date = s.v_date
 group by a.id,
      a.v_time

Will the performance take a hit?

I can't say for sure because I can't assume you have the right indexes setup to support this query without seeing the DDL of the tables. If you have indexes in place to support the join clauses, or if you don't have much data in these tables, I don't think you would take a performance hit.

Related Solutions

Mysql – Storing prices with price history and different specificity (global, user group, single user) in MySQL database

There are a couple of ways you could separate the current values from the historic ones: you could simply include a boolean field that is true for the latest price and filter on that, or you could keep current prices in a separate table. Both options involve a little extra work to maintain integrity but would make a query for current prices more efficient. This doesn't remove the need for the sub-query to pick the first row that matches the user/group/NULL condition, though fewer rows will be scanned in processing both the inner and outer queries.

The following might be more efficient as a straight query if only the current prices are in the base table (add the is_latest=true check or your current date filter to the where and on clauses if you keep the historic values in the same table):

SELECT CASE WHEN usermatch.id IS NOT NULL THEN usermatch.field1 WHEN groupmatch.id IS NOT NULL THEN groupmatch.field1 ELSE nullmatch.field1 END AS field1
     , CASE WHEN usermatch.id IS NOT NULL THEN usermatch.field2 WHEN groupmatch.id IS NOT NULL THEN groupmatch.field2 ELSE nullmatch.field2 END AS field2
     (... and so on ...)
     , CASE WHEN usermatch.id IS NOT NULL THEN usermatch.fieldN WHEN groupmatch.id IS NOT NULL THEN groupmatch.fieldN ELSE nullmatch.fieldN END AS fieldN
FROM   base_prices AS nullmatch
LEFT OUTER JOIN 
       base_prices AS groupmatch ON groupmatch.id=nullmatch.id AND nullmatch.referent_id = @GroupIDHere
LEFT OUTER JOIN 
       base_prices AS usermatch ON usermatch.id=nullmatch.id AND nullmatch.referent_id = @UserIDHere
WHERE nullmatch.referent_id IS NULL

This will only work if you always have a row for each product where referent_id is null, otherwise you could try:

SELECT CASE WHEN usermatch.id IS NOT NULL THEN usermatch.field1 WHEN groupmatch.id IS NOT NULL THEN groupmatch.field1 ELSE nullmatch.field1 END AS field1
     , CASE WHEN usermatch.id IS NOT NULL THEN usermatch.field2 WHEN groupmatch.id IS NOT NULL THEN groupmatch.field2 ELSE nullmatch.field2 END AS field2
     (... and so on ...)
     , CASE WHEN usermatch.id IS NOT NULL THEN usermatch.fieldN WHEN groupmatch.id IS NOT NULL THEN groupmatch.fieldN ELSE nullmatch.fieldN END AS fieldN
FROM   users
CROSS JOIN 
       base_prices AS nullmatch
LEFT OUTER JOIN 
       base_prices AS groupmatch ON groupmatch.id=nullmatch.id AND nullmatch.referent_id = users.groupid
LEFT OUTER JOIN 
       base_prices AS usermatch ON usermatch.id=nullmatch.id AND nullmatch.referent_id = users.id
WHERE users.id = @UserIDHere
AND   nullmatch.referent_id IS NULL

In both cases be careful using this construct in a view because further filtering by the fields brought out via those case statements will not be able to use any indexes you may have defined.

Obviously the case statement per field is a bit of faf but it should stop the query runner searching the table once for every row returned as it might with the sub-query arrangement.

In other databases you might be able to use windowing functions (like ROW_NUMBER) to do this more conveniently and efficiently, but mySQL does not support those to my knowledge.

Design Questions

Build your database in stages. You have, so far, three main entities: stores, suppliers, items. The relationship between suppliers and items seems simple enough. A supplier may supply many items and an item may be available through many suppliers. This implies an intersection/junction table between Suppliers and Items. As part of this relationship, the default price, the price in effect if no other arrangements are in effect, can be included.

create tabel SupplierItem(
    SupplierID   int not null,
    ItemID       int not null,
    DefaultPrice currency not null,
    constraint   PK_SupplierItem primary key( SupplierID, ItemID ),
    constraint   FK_SupplierItem_Supplier foreign key( SupplierID )
        references Suppliers( ID ),
    constraint   FK_SupplierItem_Item foreign key( ItemID )
        references Items( ID )
);

Now you have two ways that prices may be adjusted. But let's introduce a third discount simply because it is so common -- the volume-based discount. That can be seen as an attribute of the relationship between Supplier and Item.

create table VolumeDiscount(
    SupplierID   int not null,
    ItemID       int not null,
    MinVol       int not null,
    Discount     tinyint not null check( Discount between 0 and 99 ),
    constraint   PK_VolumeDiscount primary key( SupplierID, ItemID, MinVol )
);

I've omitted the FK definitions that are the same as already defined. Discount is a percentage reduction in price based on a volume >= MinVol and < the next highest MinVol listing. Discount might better be defined as a floating point value -- but this is, after all, for illustration only.

Now let's take the optional geographically-based discounts. This suggests yet another intersection table. It also assumes a table of geographic locations.

create table GeoDiscount(
    SupplierID   int not null,
    GeoID        int not null,
  -- ItemID       int not null,
    Discount     tinyint not null check( Discount between 0 and 99 ),
    constraint   PK_GeoDiscount primary key( SupplierID, GeoID[, ItemID] ),
    constraint   FK_GeoDiscount_Geo foreign key( GeoID )
        references GeoLocations( ID )
);

As you can see, this is easily extendable if there are different discounts for each item, though I don't see that needed very often.

Finally, there are discount agreements between some suppliers and some stores. This can get tricky in that a supplier can give a store a blanket discount for all its items and/or individual discounts on certain items.

create table StoreDiscounts(
    StoreID     int not null,
    SupplierID  int not null,
    ItemID      int,
    Discount    tinyint not null check( Discount between 0 and 99 ),
    constraint  FK_StoreDiscounts_Store foreign key( StoreID )
        references Stores( ID ),
    constraint  UQ_StoreDiscounts unique( StoreID, SupplierID, ItemID )
);

Notice that the natural key (StoreID, SupplierID, ItemID) is defined with a unique constraint rather then a primary key constraint. That is because ItemID is nullable. If ItemID is NULL, the discount applies to all item from the supplier bought by the store. Instead, or in addition, the supplier may offer discounts on a per-item basis. Or one general discount for all items (ItemID is NULL) but different discounts on a select subset of items. In the last case, the per-item discounts may be added to the general discount or it may replace the general discount. There is nothing intrinsic to the design that requires or emphasizes one or the other. Which scheme to use will have to be documented. Whether these discounts replace or are added to the standard volume discounts would be another business rule to be determined outside of this schema design.

This is an untested design, of course, and that last table can probably be improved upon. But the design is fairly "tight" as far as data integrity is concerned and is trivially normalized to BCNF. The queries to work this design will, as may be obvious, fairly complex but that will always be the cost of such desired flexibility. Your job is to make the user's job easier, not your own.

Best Answer

Related Solutions

Mysql – Storing prices with price history and different specificity (global, user group, single user) in MySQL database

Design Questions

Related Question