Mysql – Limit left join for multiple rows

greatest-n-per-grouplimitsMySQLmysql-5.6subquery

Since MySql doesn't support main alias reference in subquery is a bit challenging to create query with multiple rows where you have to limit rows of left join tables.

Let's overview simple example.

Assuming there two tables:

CREATE TABLE `items` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `parent` VARCHAR(45) CHARACTER SET 'utf8' COLLATE 'utf8_unicode_ci' NOT NULL,
  PRIMARY KEY (`id`));

CREATE TABLE `sub_items` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `child` VARCHAR(45) CHARACTER SET 'utf8' COLLATE 'utf8_unicode_ci' NOT NULL,
  `parent_id` INT NOT NULL,
  PRIMARY KEY (`id`));

With the following content:

INSERT INTO `items` (`parent`) VALUES ('Main item1');
INSERT INTO `items` (`parent`) VALUES ('Main item2');
INSERT INTO `items` (`parent`) VALUES ('Main item3');

INSERT INTO `sub_items` (`child`, `parent_id`) VALUES ('Child Item11', '1');
INSERT INTO `sub_items` (`child`, `parent_id`) VALUES ('Child Item12', '1');
INSERT INTO `sub_items` (`child`, `parent_id`) VALUES ('Child Item13', '1');
INSERT INTO `sub_items` (`child`, `parent_id`) VALUES ('Child Item21', '2');
INSERT INTO `sub_items` (`child`, `parent_id`) VALUES ('Child Item22', '2');
INSERT INTO `sub_items` (`child`, `parent_id`) VALUES ('Child Item23', '2');
INSERT INTO `sub_items` (`child`, `parent_id`) VALUES ('Child Item24', '2');
INSERT INTO `sub_items` (`child`, `parent_id`) VALUES ('Child Item31', '3');

Now let's create query something like this:

SELECT i.*, si.child, si.parent_id FROM items as i 
LEFT JOIN sub_items as si
ON i.id = si.parent_id;

The result will be:

'1', 'Main item1', 'Child Item11', '1'
'1', 'Main item1', 'Child Item12', '1'
'1', 'Main item1', 'Child Item13', '1'
'2', 'Main item2', 'Child Item21', '2'
'2', 'Main item2', 'Child Item22', '2'
'2', 'Main item2', 'Child Item23', '2'
'2', 'Main item2', 'Child Item24', '2'
'3', 'Main item3', 'Child Item31', '3'

Now the challenge is to limit join table rows (in this example limit is 2 rows):

'1', 'Main item1', 'Child Item11', '1'
'1', 'Main item1', 'Child Item12', '1'
'2', 'Main item2', 'Child Item21', '2'
'2', 'Main item2', 'Child Item22', '2'
'3', 'Main item3', 'Child Item31', '3'

If we try to use query bellow to achieve desired outcome MySQL will throw error:

SELECT i.*, si.child, si.parent_id FROM items as i 
LEFT JOIN (SELECT * FROM sub_items WHERE parent_id = i.id LIMIT 2) as si
ON i.id = si.parent_id;

Then what is alternative approach to get result described above with 2 (or more) limit rows for each left join subquery?

Best Answer

What version of MySQL are you using? MySQL 8 supports LATERAL which you need to be able to refer to i from the derived table:

SELECT i.*, si.child, si.parent_id 
FROM items as i 
LEFT JOIN LATERAL (SELECT *
                   FROM sub_items 
                   WHERE parent_id = i.id 
                   LIMIT 2) as si
    ON i.id = si.parent_id;

Concider adding an ORDER BY to your sub-select.

Other ways to do something similar is to use a window function such as:

row_number() over (partition by ... order by ...) as rn

and filter rn at the outer level.

Window functions are however only supported in MySQL 8, but you can mimic them using variables:

SELECT i.*, si.child, si.parent_id, si.rn 
FROM items as i 
LEFT JOIN (
    SELECT c.*, 
        @row_number:=CASE WHEN @parent_id = parent_id
                          THEN @row_number + 1
                          ELSE 1
                     END AS rn,      
        @parent_id := parent_id
    FROM sub_items as c
    CROSS JOIN (select @row_number := 1) as x
    CROSS JOIN (select @parent_id := -1) as y
    ORDER BY parent_id
) as si
    ON i.id = si.parent_id
    AND si.rn <= 2
;

The idea is to order the sub-query by parent_id, increment row_num by 1 for each row, and as soon as parent_id changes, reset row_num to 1.

You can read more about the idea at for example: mysql-row_number