I personally would use a model similar to the following:
The product table would be pretty basic, your main product details:
create table product
(
part_number int, (PK)
name varchar(10),
price int
);
insert into product values
(1, 'product1', 50),
(2, 'product2', 95.99);
Second the attribute table to store the each of the different attributes.
create table attribute
(
attributeid int, (PK)
attribute_name varchar(10),
attribute_value varchar(50)
);
insert into attribute values
(1, 'color', 'red'),
(2, 'color', 'blue'),
(3, 'material', 'chrome'),
(4, 'material', 'plastic'),
(5, 'color', 'yellow'),
(6, 'size', 'x-large');
Finally create the product_attribute table as the JOIN table between each product and its attributes associated with it.
create table product_attribute
(
part_number int, (FK)
attributeid int (FK)
);
insert into product_attribute values
(1, 1),
(1, 3),
(2, 6),
(2, 2),
(2, 6);
Depending on how you want to use the data you are looking at two joins:
select *
from product p
left join product_attribute t
on p.part_number = t.part_number
left join attribute a
on t.attributeid = a.attributeid;
See SQL Fiddle with Demo. This returns data in the format:
PART_NUMBER | NAME | PRICE | ATTRIBUTEID | ATTRIBUTE_NAME | ATTRIBUTE_VALUE
___________________________________________________________________________
1 | product1 | 50 | 1 | color | red
1 | product1 | 50 | 3 | material | chrome
2 | product2 | 96 | 6 | size | x-large
2 | product2 | 96 | 2 | color | blue
2 | product2 | 96 | 6 | size | x-large
But if you want to return the data in a PIVOT
format where you have one row with all of the attributes as columns, you can use CASE
statements with an aggregate:
SELECT p.part_number,
p.name,
p.price,
MAX(IF(a.ATTRIBUTE_NAME = 'color', a.ATTRIBUTE_VALUE, null)) as color,
MAX(IF(a.ATTRIBUTE_NAME = 'material', a.ATTRIBUTE_VALUE, null)) as material,
MAX(IF(a.ATTRIBUTE_NAME = 'size', a.ATTRIBUTE_VALUE, null)) as size
from product p
left join product_attribute t
on p.part_number = t.part_number
left join attribute a
on t.attributeid = a.attributeid
group by p.part_number, p.name, p.price;
See SQL Fiddle with Demo. Data is returned in the format:
PART_NUMBER | NAME | PRICE | COLOR | MATERIAL | SIZE
_________________________________________________________________
1 | product1 | 50 | red | chrome | null
2 | product2 | 96 | blue | null | x-large
As you case see the data might be in a better format for you, but if you have an unknown number of attributes, it will easily become untenable due to hard-coding attribute names, so in MySQL you can use prepared statements to create dynamic pivots. Your code would be as follows (See SQL Fiddle With Demo):
SET @sql = NULL;
SELECT
GROUP_CONCAT(DISTINCT
CONCAT(
'MAX(IF(a.attribute_name = ''',
attribute_name,
''', a.attribute_value, NULL)) AS ',
attribute_name
)
) INTO @sql
FROM attribute;
SET @sql = CONCAT('SELECT p.part_number
, p.name
, ', @sql, '
from product p
left join product_attribute t
on p.part_number = t.part_number
left join attribute a
on t.attributeid = a.attributeid
GROUP BY p.part_number
, p.name');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
This generates the same result as the second version with no need to hard-code anything. While there are many ways to model this I think this database design is the most flexible.
As first step I merge the two product tables. I need an full outer join on product1 and product2 tables. To obtain on outer join in mysql we can union a left join and a right join. Start with a left join. In the following query IFNULL(p1.key1, p2.key1) is useful to merge two key column coming from the two product tables and IFNULL(, 'NaN') is useful to obtain 'NaN' in the output.
select IFNULL(p1.key1, p2.key1) as key1,
IFNULL(p1.key2, p2.key2) as key2,
IFNULL(p1.serial, p2.serial) as serial,
IFNULL(p1.product_data, 'NaN') as product_data1,
IFNULL(p2.product_data, 'NaN') as product_data2
from product1 p1
left join product2 p2 on p1.key1 = p2.key1 and p1.key2 = p2.key2 and p1.serial = p2.serial;
To obtain a full outer join I need to repeat the previous query with 'right join' and union the result with the 'left join'
select IFNULL(p1.key1, p2.key1) as key1,
IFNULL(p1.key2, p2.key2) as key2,
IFNULL(p1.serial, p2.serial) as serial,
IFNULL(p1.product_data, 'NaN') as product_data1,
IFNULL(p2.product_data, 'NaN') as product_data2
from product1 p1
left join product2 p2 on p1.key1 = p2.key1 and p1.key2 = p2.key2 and p1.serial = p2.serial
union
select IFNULL(p1.key1, p2.key1) as key1,
IFNULL(p1.key2, p2.key2) as key2,
IFNULL(p1.serial, p2.serial) as serial,
IFNULL(p1.product_data, 'NaN') as product_data1,
IFNULL(p2.product_data, 'NaN') as product_data2
from product1 p1
right join product2 p2 on p1.key1 = p2.key1 and p1.key2 = p2.key2 and p1.serial = p2.serial;
For convenience, I create a view with the previous query:
create or replace view p12 as
select IFNULL(p1.key1, p2.key1) as key1,
IFNULL(p1.key2, p2.key2) as key2,
IFNULL(p1.serial, p2.serial) as serial,
IFNULL(p1.product_data, 'NaN') as product_data1,
IFNULL(p2.product_data, 'NaN') as product_data2
from product1 p1
left join product2 p2 on p1.key1 = p2.key1 and p1.key2 = p2.key2 and p1.serial = p2.serial
union
select IFNULL(p1.key1, p2.key1) as key1,
IFNULL(p1.key2, p2.key2) as key2,
IFNULL(p1.serial, p2.serial) as serial,
IFNULL(p1.product_data, 'NaN') as product_data1,
IFNULL(p2.product_data, 'NaN') as product_data2
from product1 p1
right join product2 p2 on p1.key1 = p2.key1 and p1.key2 = p2.key2 and p1.serial = p2.serial;
To check we can try with:
mysql> select * from p12 where key1 = 1 and key2 =1 ;
+------+------+--------+---------------+---------------+
| key1 | key2 | serial | product_data1 | product_data2 |
+------+------+--------+---------------+---------------+
| 1 | 1 | 0 | 15.556 | 5.556 |
| 1 | 1 | 1 | 14.996 | 4.996 |
| 1 | 1 | 2 | 12.556 | NaN |
| 1 | 1 | 3 | 15.669 | NaN |
+------+------+--------+---------------+---------------+
So the he final query is:
select mi.key1, mi.key2, p12.serial, mi.info1, mi.info2, mi.date, p12.product_data1, p12.product_data2
from main_info mi
inner join p12 on mi.key1 = p12.key1 and mi.key2 = p12.key2
order by mi.key1, mi.key2, p12.serial;
+------+------+--------+-------+-------+----------+---------------+---------------+
| key1 | key2 | serial | info1 | info2 | date | product_data1 | product_data2 |
+------+------+--------+-------+-------+----------+---------------+---------------+
| 1 | 1 | 0 | 15 | 90 | 20120501 | 15.556 | 5.556 |
| 1 | 1 | 1 | 15 | 90 | 20120501 | 14.996 | 4.996 |
| 1 | 1 | 2 | 15 | 90 | 20120501 | 12.556 | NaN |
| 1 | 1 | 3 | 15 | 90 | 20120501 | 15.669 | NaN |
| 1 | 2 | 0 | 14 | 92 | 20120601 | 12.556 | 2.556 |
| 1 | 2 | 1 | 14 | 92 | 20120601 | 13.335 | 3.335 |
| 1 | 2 | 2 | 14 | 92 | 20120601 | NaN | 2.56 |
| 1 | 2 | 3 | 14 | 92 | 20120601 | NaN | 3.556 |
| 1 | 3 | 1 | 15 | 82 | 20120801 | 12.225 | 2.225 |
| 1 | 3 | 2 | 15 | 82 | 20120801 | 13.556 | 3.556 |
| 1 | 3 | 3 | 15 | 82 | 20120801 | 14.556 | NaN |
| 2 | 1 | 0 | 17 | 90 | 20130302 | 12.556 | NaN |
| 2 | 1 | 1 | 17 | 90 | 20130302 | 13.553 | NaN |
| 2 | 1 | 2 | 17 | 90 | 20130302 | 12.335 | NaN |
| 2 | 2 | 0 | 16 | 88 | 20130601 | NaN | 2.556 |
| 2 | 2 | 1 | 16 | 88 | 20130601 | NaN | 3.553 |
+------+------+--------+-------+-------+----------+---------------+---------------+
Best Answer
Create a
View
to retrieve the main record:Build the query:
OR in one query