There is not a "better" or a "worse" join type. They have different meaning and they must be used depending on it.
In your case, you probably do not have employees with no work_log
(no rows in that table), so LEFT JOIN
and JOIN
will be equivalent in results. However, if you had such a thing (a new employee with no registered work_log
), a JOIN
wold omit that employee, while a left join (whose first table is employees) would show all of them, and nulls on the fields from work_log
if there are not matches.
Image by C.L. Moffatt on Code Project
Again, performance is a secondary thing to query correctness. Some people say that you shouldn't use LEFT JOIN
s. It is true that a LEFT JOIN
forces the optimizer to execute the query in one particular order, preventing some optimizations (table reordering) in some cases. Here is one example. But you should not choose one over the other if correctness/meaning is sacrificed, as an INNER JOIN
is not inherently worse. The rest of the usual optimizations apply as usual.
In summary, do not use LEFT JOIN
if you really mean INNER JOIN
.
In MySQL CROSS JOIN
, INNER JOIN
and JOIN
are the same. In the standard, and semantically, a CROSS JOIN
is an INNER JOIN
without an ON
clause, so you get every combination of rows between tables.
You have examples of all semantic types of join on Wikipedia. In practice, in MySQL, we tend to only write JOIN
and LEFT JOIN
.
First, join TABLE_2
and TABLE_3
using an inner join and additionally filtering on Value_A = 'a'
:
SELECT
t2.ID_1,
t3.ID_A,
t3.Value_A
FROM
TABLE_2 t2
INNER JOIN TABLE_3 t3 ON t2.ID_A = t3.ID_A
WHERE
t3.Value_A = 'a'
This will give you the following result set:
ID_1 ID_A Value_A
---- ---- -------
1 b a
4 d a
Now use the above as a derived table and join it, using an outer join this time, to TABLE_1
, additionally filtering the results on Value_1 = 11
:
SELECT
t1 .ID_1,
t23.ID_A,
t1 .Value_1,
t23.Value_A
FROM
TABLE_1 t1
LEFT JOIN
(
SELECT
t2.ID_1,
t3.ID_A,
t3.Value_A
FROM
TABLE_2 t2
INNER JOIN TABLE_3 t3 ON t2.ID_A = t3.ID_A
WHERE
t3.Value_A = 'a'
) t23 ON t1.ID_1 = t23.ID_1
WHERE
t1.Value_1 = 11
;
That will give you the output you want:
ID_1 ID_A Value_1 Value_A
---- ---- ------- -------
1 b 11 a
3 NULL 11 NULL
However, nesting a query is not the only way to solve your problem – you can also use a nested join, which is much more concise:
SELECT
t1.ID_1,
t3.ID_A,
t1.Value_1,
t3.Value_A
FROM
TABLE_1 t1
LEFT JOIN
TABLE_2 t2
INNER JOIN TABLE_3 t3 ON t2.ID_A = t3.ID_A AND t3.Value_A = 'a'
ON t1.ID_1 = t2.ID_1
WHERE
t1.Value_1 = 11
;
The last query implements exactly the same logic as the previous query: first, tables TABLE_2
and TABLE_3
are joined and the result is filtered, then it is joined to TABLE_1
and the final set is filtered again. Some people also add brackets around a nested join:
FROM
TABLE_1 t1
LEFT JOIN
(
TABLE_2 t2
INNER JOIN TABLE_3 t3 ON t2.ID_A = t3.ID_A AND t3.Value_A = 'a'
)
ON t1.ID_1 = t2.ID_1
to make it clearer (perhaps both for themselves and for future maintainers) that the nested join takes place before the outer-level one, logically, although the syntax is unambiguous enough without them.
Nevertheless, many people find it confusing even with brackets, and if you find yourself struggling with it as well, there is another option – right outer join:
SELECT
t1.ID_1,
t3.ID_A,
t1.Value_1,
t3.Value_A
FROM
TABLE_2 t2
INNER JOIN TABLE_3 t3 ON t2.ID_A = t3.ID_A AND t3.Value_A = 'a'
RIGHT JOIN TABLE_1 t1 ON t1.ID_1 = t2.ID_1
WHERE
t1.Value_1 = 11
;
Again, the logical sequence of events specified by the FROM clause repeats that of the derived table solution: a join between TABLE_2
and TABLE_3
, filtered, is followed by a join with TABLE_1
. The different syntax does not alter the outcome and the results produced still match your requirements.
Best Answer
First (common misunderstanding), distinct is not applied to individual columns, what you get is distinct rows. This is exactly the same as your
GROUP BY
does, so distinct is redundant.Second, you need to determine which row among duplicates that you are interested in (or get a random one). You can use window functions to achieve this by enumerating ArchiveCatalogues per Artists:
Note that I removed the group by since I'm not sure what the purpose was. Now you can select the first row from there:
The join looks strange but it's hard to tell without knowing what the tables and their key's look like.