Mysql – join 3 tables with one to many and one to one relationship


I have 3 tables:

        - Id
        - Memory_map_version
        - Product_id_hex
        - Parameter --------------------> want to add this to Histogram table
        - Id
        - Memory_map_version -------> these 2 keys are referencing to the Memory_map table
        - Product_id_hex ---------> these 2 keys are referencing to the Memory_map table
        - Id
        - Histogram_read_id

The relationship is:

Histogram_read (one) --> histogram(many)
Memory_map (one) --> histogram_read(one) 

So Now I want to create a NEW Histogram table with the [Parameter] field from Memory_map table added to the original Histogram table.

Because Memory_map (one) –> histogram_read(one) is a ONE-TO-ONE relationship, so I will need to join these two tables first before I join the Histogram_read table with Memory_map table to add the [Parameter] field to Histogram table.

So, should I do a inner join first between Histogram_read and Histogram based on Histogram_read_id? (since Histogram_read_id is a key referencing the Histogram_read table records), but since it's a ONE-TO-MANY relationship, is it ok to do the INNER JOIN?

And Because I only want all records and columns from Histogram table – just with [Parameter] field added, so if I join the Histogram_read table with Histogram table before I join the Histogram_read table with the Memory_map table, how do I make sure the final table of Histogram wasn't messed up by columns in the Histogram_read table?

I have this following query but it doesn't work:

create view histogram_new
select M.parameter, Nest.* from memory_map M
(select * from histogram_read R
INNER JOIN histogram H
ON H.histogram_read_id = Nest
ON M.memory_map_version = Nest.memory_map_version 
AND M.product_id_hex = Nest.product_id_hex;

I'm confused about the 3 tables join with one=to-many and one-to-one relationship, I am not sure what the logic should be. anyone can helps? THANKS!

Best Answer

With relationship cardinalities between tables, you should think of the outcome of INNER JOINING those tables as multiplicative per row, in a sense. In a one-to-one relationship you'll always end up with, at most, 1 x 1 record per record in the smaller table. For example if your memory_map table has 10 records and your histogram_read table has 5 records, the most records you could possibly end up with is 5 records. Those would be whichever 5 records in memory_map that had a related record to the 5 in histogram_read.

In a one-to-many relationship, the above is similarly true except the many table dictates how many rows will be returned per row in the one table. So now the multiplicative relationship is 1 x N where N is the number of related rows in the many table to 1 particular row in the one table. For example, if your histogram_read table has 5 records, and your histogram table has 20, the most records an INNER JOIN could possibly return is 20, if all 20 records related to the 5 records in histogram_read. (Then you'd be getting every row back from histogram in that case.)

So in short, with an INNER JOIN, for a one-to-one relationship, the most records you can end up with is the total number of records in the smaller table between the join. For a one-to-many relationship, the most records would be the total number of records in the larger table.

The last thing to keep in mind is INNER JOIN only returns rows that match between both tables of the join clause, and filters out the rows that don't match. OUTER JOIN like a LEFT OUTER JOIN (LEFT JOIN for short) or RIGHT OUTER JOIN (RIGHT JOIN for short) will always return all rows from the table on that side of the join clause, and will NULL out the columns of the table on the other side of the join clause where there aren't any matches. For example, if your histogram_read table has 5 records and histogram has 20 records, and only 12 of the records in histogram related to the 5 records in histogram_read then with an INNER JOIN between the two tables you would only get 12 records. But with an OUTER JOIN that points to the histogram table, such as FROM histogram_read RIGHT JOIN histogram, will ensure you get all 20 records back from the histogram table, with NULL values in the columns from histogram_read for the 8 rows that didn't match from histogram.

All of that being said, I don't see anything inherently wrong with the query you wrote, other than the INNER JOIN in your subquery possibly filtering rows out of histogram if they don't match histogram_read. You can use all OUTER JOIN clauses to prevent that from happening, and you actually don't even need a subquery. Here's an example query that should essentially produce what you're looking for:

SELECT M.parameter, R.*, H.* -- You shouldn't use *, it's bad practice, anti-pattern, and potentially bad for performance, instead explicitly list only the columns you need
FROM histogram H
LEFT JOIN histogram_read R
    ON H.histogram_read_id =
LEFT JOIN memory_map M
    ON M.memory_map_version = R.memory_map_version 
    AND M.product_id_hex = R.product_id_hex

With one-to-many relationships, I personally prefer to start with the many table in my joins, but it doesn't really matter, as long as you use the correct OUTER JOIN. In this case, because the many table is on the left side of the join clause, we need to use a LEFT JOIN. The above query will return all rows from histogram, with the parameter column from memory_map, with NULL values for wherever there wasn't a match on a record between histogram and the rest of the tables.