Mysql – JOIN 2 Tables on 1 Table and Select their separate quantity

MySQL

Is there any other way i can display the quantity field i have from 2 different tables where their ID's are referenced on 1 table.

Example:

shipment_box table, columns: shipment_box_id, box_code, quantity, date_created

shipment_consumable table, columns: shipment_consumable_id, shipment_box_id, item_id, date_added

shipment_unit table, columns: shipment_unit_id, shipment_box_id, item_id, date_added

The requirements would be:

  • The user must be able to create a shipment box.
  • The created shipment box must be able to accept a mix of unit and consumable items.
  • The report generated must display all shipment box's contents and quantity per item_code.

Below would be a visual representation of my database tables:
item table

shipment_box table

shipment_unit table

shipment_container table

my initial idea was to create a separate SELECT Query for shipment_unit and shipment_consumable joining to shipment_box but if there's a way i can do this around without having to UNION these two queries, it would be more ideal.

below is my draft query (not tested yet):

SELECT
    sb.box_code,
    i.item_code,
    COUNT(su.shipment_unit_id) 'Qty'
FROM
    shipment_box sb
JOIN
    shipment_unit su
ON
    su.shipment_box_id = sb.shipment_box_id
JOIN
    item i
ON
    i.item_id = su.item_id
GROUP BY
    sb.box_code,
    i.item_code

UNION

SELECT
    sb.box_code,
    i.item_code,
    COUNT(sc.shipment_unit_id) 'Qty'
FROM
    shipment_box sb
JOIN
    shipment_consumble sc
ON
    sc.shipment_box_id = sb.shipment_box_id
JOIN
    item i
ON
    i.item_id = sc.item_id
GROUP BY
    sb.box_code,
    i.item_code

i was thinking of something similar to a hypothetical query like below:

SELECT
    sb.box_code,
    i.item_code, -- Display item_code from joined item table
    IF(COUNT(su.shipment_unit_id) = 0, COUNT(sc.shipment_consumable_id)) 'Qty' -- Display quantity of shipment_unit and shipment_consumable items
FROM
    shipment_box sb
JOIN
    shipment_unit su
ON
    su.shipment_box_id = sb.shipment_box_id
JOIN
    shipment_consumble sc
ON
    sc.shipment_box_id = sb.shipment_box_id
JOIN
    item i
ON
    (
        i.item_id = su.item_id
    OR
        i.item_id = sc.item_id
    )
GROUP BY
    sb.box_code,
    i.item_code

i am open with a lot of ideas that can help me please. Thank you.

Best Answer

First, let's sum up a few assumptions:

  • It appears that a given item cannot be both a consumable and a shipment unit, unless item_code is not a unique value in item.
  • The shipment_unit table is a simple list of all "unit" items in all shipment boxes; shipment_consumable is (similarly) a list of all "consumables" in all shipment boxes.
  • Both shipment_unit and shipment_consumable have no quantity column of their own; so, if a given shipment box was supposed to have (say) 1 Windows phone and two wired headsets, you would have to have two rows in shipment_consumable for the box, with the same shipment_box_id and item_id.
  • You are looking for the items and quantities in 1 instance of a shipment_box, rather than the total items and quantities in the total quantity of the shipment_box. That is to say, if shipment_box.quantity is 17, and the box contains 2 Windows phones, you want 2 as the count, not 34 (2 * 17).

First - I must say that there appears to be absolutely no reason to have both shipment_unit and shipment_consumable. The tables appear to have the exact same structure, and foreign keys to the exact same other tables (shipment_box and item). You know whether a given item is consumable or a "unit" form the item table, so there's no good reason I can see to separate the two things from each other. I feel obligated to point this out, as I suspect your design would be much simpler with just one table (shipment_content, perhaps).

Similarly, having a quantity field for each item in a shipment_unit (or _consumable, or _content) seems like an obvious plus; barring different values for date_added, there's no good reason to have:

 shipment_unit_id | shipment_box_id | item_id |   date_added
------------------+-----------------+---------+------------------
         7        |        4        |    1    | 2017-10-11 00:00
         8        |        4        |    1    | 2017-10-11 00:00

in order to get two Windows phone in the same shipment_box. This may not seem likely at this point, but to me it seems like a trivially predictable need for the future. That said, some would say if you don't need it as part of the current plans, leave it out rather than cluttering your design with what could become a bunch of unused fields.


That said, we don't always have the option to choose certain elements of the design - you may be stuck with this table layout, through no fault of you own.

So, if this is how it all works, and if it's how it must work, how do you get the info you need? I think your draft query would work accurately, and your JOIN query would not (except for degenerate cases, where there's 0 or 1 rows for a given shipment_box_id in each of the other two tables).

As written, there's one potential issue with your JOIN query - it may not be relevant, but it should be noted. You are performing an INNER JOIN to both ship_unit and ship_consumables. If a shipment_box only had "unit"-type items in it, then you would get back no data for it - because as written, there must be at least one "unit" and at least one consumable for a shipment_box to be displayed.

To avoid this, we should make the joins LEFT JOINs, so we get a shipment_box even if it's only got units or consumables, but not both (Note that by doing this, we'll also get a shipment_box that has no items attached. If necessary, we can filter those out later.)

The next thing to understand is how joins work. Each row in the underlying result set will have the fields from shipment_unit, and the fields from shipment_consumable. If you'll always have at most 1 row from each table, things will work as you expect; otherwise, they won't.

Let's posit a shipment_box (shipment_box_id = 5, box_code = 'BX-255', quantity = 10) with the following contents:

  • 1 Windows phone (unit)
  • 1 wired headset (consumable)
  • 1 wireless headset (consumable)

So, shipment_unit has:

 shipment_unit_id | shipment_box_id | item_id |   date_added
------------------+-----------------+---------+------------------
         10       |        5        |    2    | 2017-10-12 00:00

and shipment_consumable has:

 shipment_consumable_id | shipment_box_id | item_id |   date_added
------------------------+-----------------+---------+------------------
           14           |        5        |    3    | 2017-10-12 00:00
           15           |        5        |    4    | 2017-10-12 00:00

If we join shipment_box, shipment_unit, and shipment_consumable as you indicate, we get the following (I've left out the shipment_box columns, and the shipment_box_id and date_added from the other two tables, for readability):

 shipment_unit_id | su_item_id | shipment_consumable_id | sc_item_id
------------------+------------+------------------------+------------
         10       |     2      |           14           |     3 
         10       |     2      |           15           |     4 

Basically, since there's no criteria to tie records from shipment_unit and shipment_consumable together, we get the cross-product of the records from each of those table that match our current shipment-box.

When we go to match item rows to what weve already got, we see that each row will match two items - the shipment_unit item, and the shipment_consumable item. So, with item_code attached, we've got this:

 shipment_unit_id | su_item_id | shipment_consumable_id | sc_item_id | item_code
------------------+------------+------------------------+------------+-----------
         10       |     2      |           14           |     3      |  ITM-002
         10       |     2      |           14           |     3      |  ITM-003
         10       |     2      |           15           |     4      |  ITM-002
         10       |     2      |           15           |     4      |  ITM-004

So, when we group by item_code there are two rows with 'ITM-002', and we'll get a count of 2 for that item (even though only one is in the shipment_box).

There are a couple of ways to work around this.

Since, as I noted above, there's really no reason to have separate shipment_unit and shipment_consumable table, you can use a union of the two tables within your existing query, instead of getting the UNION of two completely different queries:

SELECT
    sb.box_code,
    i.item_code,
    COUNT(*) `Qty` -- Display quantity of items
FROM
    shipment_box sb
JOIN
    (SELECT shipment_box_id, item_id FROM shipment_unit
     UNION ALL
     SELECT shipment_box_id, item_id FROM shipment_consumable
    ) AS box_item
ON
    sb.shipment_box_id = box_item.shipment_box_id
JOIN
    item i
ON
    box_item.item_id = i.item_id
GROUP BY
    sb.box_code,
    i.item_code
ORDER BY
    box_code,
    item_code
;

This way, in our example, the subquery would generate three rows - 1 from shipment_unit, and 2 from shipment_consumable.

Of course, that solution still includes a UNION. So, if you don't like that, then we can ignore counting the items from our result set altogether, and make simple separate queries to get the counts:

SELECT
    sb.box_code,
    i.item_code, -- Display item_code from joined item table
    (SELECT COUNT(*)
       FROM shipment_unit su2 INNER JOIN item i2 ON (su2.item_id = i2.item_id)
      WHERE su2.shipment_box_id = sb.shipment_box_id
        AND i2.item_code = i.item_code
    ) +
    (SELECT COUNT(*)
       FROM shipment_consumable sc2 INNER JOIN item i2 ON (sc2.item_id = i2.item_id)
      WHERE sc2.shipment_box_id = sb.shipment_box_id
        AND i2.item_code = i.item_code
    ) as `Qty`
FROM
    shipment_box sb
LEFT JOIN
    shipment_unit su
ON
    su.shipment_box_id = sb.shipment_box_id
LEFT JOIN
    shipment_consumable sc
ON
    sc.shipment_box_id = sb.shipment_box_id
LEFT JOIN
    item i
ON
    (
        i.item_id = su.item_id
    OR
        i.item_id = sc.item_id
    )
GROUP BY
    sb.box_code,
    sb.shipment_box_id,
    i.item_code
ORDER BY
    box_code,
    item_code
;

Note that we have to add sb.shipment_box_id to the GROUP BY, since we're using that for the SELECT COUNT(*) subqueries.

This is making a number of allowances:

  • It will work even if item_code is not unique (for instance, if you have item_id 27 that's a unit, and item_id 31 that's a consumable, but (since they're actually the same thing) both rows have item_code 'ITM-027')
  • It should work if the above situation is true, and (for some bizarre reason) you've got a box with item_id 27 in shipment_unit, AND item_id 31 in shipment_consumable. Since it's counting the occurences in both tables regardless, it would correctly generate a count of 2 for item_code 'ITM-027'

If (as I would tend to assume) every item_code has one and only one item_id, We can avoid the extra SELECT COUNT(*) queries altogether.

SELECT
    sb.box_code,
    i.item_code, -- Display item_code from joined item table
    CASE i.item_type
      WHEN 1
        THEN COUNT(DISTINCT su.shipment_unit_id)
      WHEN 2
        THEN COUNT(DISTINCT sc.shipment_consumable_id)
      ELSE NULL
    END as `Qty`
FROM
    shipment_box sb
LEFT JOIN
    shipment_unit su
ON
    su.shipment_box_id = sb.shipment_box_id
LEFT JOIN
    shipment_consumable sc
ON
    sc.shipment_box_id = sb.shipment_box_id
LEFT JOIN
    item i
ON
    (
        i.item_id = su.item_id
    OR
        i.item_id = sc.item_id
    )
GROUP BY
    sb.box_code,
    i.item_code,
    i.item_type
ORDER BY
    box_code,
    item_code
;

Here, we use i.item_type' to determine what value toCOUNT, so we need that in theGROUP BY`.

We're back to basing our COUNT on our result set. As noted before, we could see our incidence of unit or consumable items multiplied through the JOIN. However, we're not counting the rows directly; we're counting the number of distinct IDs from shipment_unit or shipment_consumable. This factors those multiple rows right back out again, and we'll get correct counts.

Finally, here's a DB-Fiddle link showing your two queries (tweaked slightly so the work), and my three, in action.

I'd recommend my 2nd query (COUNT via sub-queries) if item_code is not unique, and my 3rd (COUNT DISTINCT) is it is unique.