So I have four tables as follows:
CREATE TABLE softdrinks (
drinkID INT NOT NULL AUTO_INCREMENT,
drinkName VARCHAR(100),
PRIMARY KEY (drinkID)
);
CREATE TABLE categories (
categoryID INT NOT NULL AUTO_INCREMENT,
drinkID INT NOT NULL,
categoryName VARCHAR(100),
FOREIGN KEY (drinkID) REFERENCES softdrinks(drinkID),
PRIMARY KEY (categoryID)
);
CREATE TABLE staff (
staffID INT NOT NULL AUTO_INCREMENT,
staffName VARCHAR(150),
staffAge INT NOT NULL,
PRIMARY KEY (staffID)
);
CREATE TABLE sold (
soldID INT NOT NULL AUTO_INCREMENT,
drinkID INT NOT NULL,
staffID INT NOT NULL,
amount INTEGER,
FOREIGN KEY (staffID) REFERENCES staff(staffID),
FOREIGN KEY (drinkID) REFERENCES softdrinks(drinkID),
PRIMARY KEY (soldID)
);
This is what they look like when I show all the databases on their own:
drinks table:
drinkID drinkName
1 Pepsi
2 Dr Pepper
3 Sprite
categories table:
categoryID drinkID categoryName
1 1 Cola
2 2 Cola
3 3 Orange
staff table:
staffID staffName staffAge gender
1 Bruce 35 MALE
2 Patel 65 MALE
3 Drew 25 MALE
sold table:
soldID drinkID staffID amount
1 1 1 5
2 1 2 10
3 1 3 25
4 2 2 10
5 3 1 30
6 3 2 14
7 3 3 25
So for drinkID = 1, the sold table gives
soldID drinkID staffID amount
1 1 1 5
2 1 2 10
3 1 3 25
Basically, I have a complex statement that is fetching all the data as follows:
SELECT tbl_softdrinks.drinkID, tbl_softdrinks.drinkName,
(IFNULL(tbl_sold.totalAVG, 0.0)) AS finalAVG, tbl_soldUsers.amount,
tbl_categories.categoryName,
tbl_soldUsers.staffID
FROM softdrinks tbl_softdrinks
INNER JOIN categories tbl_categories
ON tbl_softdrinks.drinkID = tbl_categories.drinkID
LEFT JOIN
(SELECT tbl_sold.drinkID, ROUND(AVG(tbl_sold.amount),1) AS totalAVG, tbl_sold.amount,
tbl_sold.staffID FROM sold tbl_sold
GROUP BY tbl_sold.drinkID) tbl_sold
ON tbl_softdrinks.drinkID = tbl_sold.drinkID
LEFT JOIN
sold tbl_soldAmount
ON tbl_softdrinks.drinkID = tbl_soldAmount.drinkID
LEFT JOIN
sold tbl_soldUsers
ON tbl_softdrinks.drinkID = tbl_soldUsers.drinkID
INNER JOIN (SELECT DISTINCT staffID, staffName, staffAge FROM staff tbl_staff
GROUP BY tbl_staff.staffID) tbl_staff
ON tbl_soldUsers.staffID = tbl_staff.staffID
WHERE tbl_softdrinks.drinkID = 1;
This is returning:
drinkID drinkName finalAVG amount categoryName staffID
1 Pepsi 13.3 5 Cola 1
1 Pepsi 13.3 5 Cola 1
1 Pepsi 13.3 5 Cola 1
1 Pepsi 13.3 10 Cola 2
1 Pepsi 13.3 10 Cola 2
1 Pepsi 13.3 10 Cola 2
1 Pepsi 13.3 25 Cola 3
1 Pepsi 13.3 25 Cola 3
1 Pepsi 13.3 25 Cola 3
As opposed to:
drinkID drinkName finalAVG amount categoryName staffID staffName staffAge
1 Pepsi 13.3 5 Cola 1 Bruce 35
1 Pepsi 13.3 10 Cola 2 Patel 65
1 Pepsi 13.3 25 Cola 3 Drew 25
This is on sqlfiddle too: http://sqlfiddle.com/#!9/8a6223/9
I believe I am returning cartesian
Would really mean a lot if someone could fix my query, not necessarily database design
Best Answer
Of course your data model could be improved, but them main problem is an unneccesary join.
You join
sold
twice ondrinkID
, there are 3 rows withdrinkID=1
, so the first join returns 3 rows and then you join each row 3 times, it's a kind of cross join resulting in 3*3 = 9.The fix is removing the 2nd join (of course simly adding DISTINCT will also work, but that's just hiding the root cause):
See fiddle