I have TWO QUERY which work perfect, but when i join those two query as LEFT JOIN
the output become wrong. So what is the problem.?
Query 1:
select Batsman.innings_no,
bowler,
sum(Balls) as B,
ifnull(sum(Runs_In_Over), 0) as R,
ifnull(sum(Zero), 0) as 0s,
ifnull(sum(Four), 0) as 4s,
ifnull(sum(Six), 0) as 6s
from (SELECT A.innings_no,
A.bowler,
count(*) as Balls,
Sum(B.Runs_Scored) As 'Runs_In_Over',
case B.runs_scored
when 0 then count(*) end as Zero,
case B.runs_scored
when 4 then count(*) end as Four,
case B.runs_scored
when 6 then count(*) end as Six
FROM `database`.ball_by_ball A
INNER JOIN `database`.batsman_scored B
using (match_id, over_id, ball_id, innings_no)
where match_id = 981018
and innings_no = 2
GROUP BY A.innings_no, A.bowler, B.runs_scored) as Batsman
group by innings_no, bowler order by bowler;
Query 2:
select innings_no,
bowler,
ifnull(sum(Wides), 0) as WD,
ifnull(sum(NoBalls), 0) as NB,
sum(Extra_runs) as Extra
from (SELECT D.innings_no,
D.bowler,
case E.extra_type_id
when 2 then count(*) end as Wides,
case E.extra_type_id
when 4 then count(*) end as NoBalls,
Sum(E.Extra_Runs) As 'Extra_runs'
FROM `database`.ball_by_ball D
INNER JOIN `database`.extra_runs E
using (match_id, over_id, ball_id, innings_no)
WHERE match_id = 981018
and innings_no = 2
and E.Extra_Type_Id IN (2, 4)
GROUP BY D.innings_no, D.bowler, E.extra_type_id) as Extra
group by innings_no, bowler order by bowler;
Join Query:
select Batsman.innings_no,
Batsman.bowler,
sum(Balls) as B,
ifnull(sum(Runs_In_Over), 0) as R,
ifnull(sum(Zero), 0) as 0s,
ifnull(sum(Four), 0) as 4s,
ifnull(sum(Six), 0) as 6s,
ifnull(sum(Wides), 0) as WD,
ifnull(sum(NoBalls), 0) as NB,
ifnull(sum(Extra_runs), 0) as Extra
from (SELECT A.innings_no,
A.bowler,
count(*) as Balls,
Sum(B.Runs_Scored) As 'Runs_In_Over',
case B.runs_scored
when 0 then count(*) end as Zero,
case B.runs_scored
when 4 then count(*) end as Four,
case B.runs_scored
when 6 then count(*) end as Six
FROM `database`.ball_by_ball A
INNER JOIN `database`.batsman_scored B
using (match_id, over_id, ball_id, innings_no)
where match_id = 981018
and innings_no = 2
GROUP BY A.innings_no, A.bowler, B.runs_scored) as Batsman
LEFT JOIN (SELECT D.innings_no,
D.bowler,
case E.extra_type_id
when 2 then count(*) end as Wides,
case E.extra_type_id
when 4 then count(*) end as NoBalls,
Sum(E.Extra_Runs) As 'Extra_runs'
FROM `database`.ball_by_ball D
INNER JOIN `database`.extra_runs E
using (match_id, over_id, ball_id, innings_no)
WHERE match_id = 981018
and innings_no = 2
and E.Extra_Type_Id IN (2, 4)
GROUP BY D.innings_no, D.bowler, E.extra_type_id) as Extra using (innings_no, bowler)
group by innings_no, bowler order by bowler;
Query 1 Output:
Query 2 Output:
Join Query Output:
If you saw Query 1 and 2 screenshot and compare with join query for specific Bowler
, then it's clear that when i use left join as sub query the result don't show exactly what it shows in single query.
So what is the problem.? What i need to do to solve this problem and why this problem happening?
Best Answer
You were close, here is what I think you were going for:
The difference in my query and your Join Query is that each of the subqueries is being grouped by just
innings_no
andbowler
in my query.How does group by work
The
GROUP BY
clause will restrict the resulting query to return just a single row for each unique combination of the columns listed in theGROUP BY
.In all of your queries, you are initially grouping by 3 columns, which means you are getting a separate row for each combination of
innings_no
,bowler
, and a 3rd column. For Query 1 this 3rd column isruns_scored
, for Query 2 this column isextra_type_id
. This is generally not a valid way to useGROUP BY
with aggregate functions, likeSUM
, that you used.However, in both Query 1 and Query 2, you then wrap them in ANOTHER query, having just
innings_no
andbowler
as theGROUP BY
, so you are getting the output you expected.But in Join Query you are applying the
LEFT JOIN
before applying that outerGROUP BY
, which was fixing your results in Query 1 and Query 2. Therefore, your Join Query is joining together different results than the output of the other two queries. This is why it doesn't appear to work as expected.My query should fix things, but if it doesn't provide the correct result (because I missed some valid reason you might have had for grouping the way you did) then just use the
LEFT JOIN
after applying the outerGROUP BY
s. That will likely result in worse performance, that depending on your data, that may not matter to you.LEFT JOIN Query 2 to Query 1
Here is how to simply do a LEFT JOIN of the two queries, which should yield the result you want, based on comments.
All I did to create this query was take your Query 1 and Query 2, stuck a LEFT JOIN between them, and added a
SELECT * FROM
to the start and aUSING
clause for the join. I also placed theORDER BY
outside, as it won't have any effect inside, unlike theGROUP BY
.Be aware that there is likely simpler way to get the result you want, but it requires a deeper understanding of what output you want and how your data is organized, and so is beyond the scope of this question.