MySQL Query Performance – ORDER BY Slow with Join, Fast as Two Queries

MySQLorder-byquery-performance

Why is the following query slow but fast when I provide the values inline?

select u.* from user u
join user_group g on u.group_id = g.id
where g.account_id = 1
order by u.id limit 10;
-- takes ~30ms
select id from user_group where account_id = 1;
-- which is (99,198,297,396,495,594,693,792,891,990)

select * from user
where group_id in (99,198,297,396,495,594,693,792,891,990)
order by id limit 10;
-- takes ~1ms

Sub query is also slow. The plan is the same as the join.

select u.* from user u
where u.group_id in (select id from user_group where account_id = 1)
order by u.id limit 10;
-- ~30ms

All queries produce the same result.

98  0   99
197 0   198
296 0   297
395 0   396
494 0   495
593 0   594
692 0   693
791 0   792
890 0   891
989 0   990

Schema

I've got the following simple table structure, with 100 accounts 1k user_groups and 10 million users.

create table account(
  id int primary key auto_increment
);

create table user_group(
  id int primary key auto_increment,
  account_id int not null,
  foreign key (account_id) references account(id)
);

create table user(
  id int primary key auto_increment,  
  deleted tinyint default 0,
  group_id int not null,
  foreign key (group_id) references user_group(id)
);

-- I've been trying with this index, but it doesnt seem to help.
create index user_1 on user(group_id, id, deleted);

Plans

fast explain

The plan with the join using indexes and doing temporary filesort.
slow explain

I don't understand why MySql seems to think it needs actually do the complete join to filter the data down. We clearly don't read anything from user_group.

For comparison I've tried the same thing in PostgreSQL and both queries run fast.

Why is it slow and is there a way to write the query (a single query!) that does it quickly? Sub select doesnt work.

This dbfiddle shows the problem.
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=8ed68310d8ca72e9daef389dc0469a6f

Using MySql 8.0.23

Thanks

Edit

Here are the complete Session Status Handler debug details. It looks like the slow query is reading every value unlike the fast one.

-- FLUSH STATUS;
-- select u.* from user u join user_group g on u.group_id = g.id where g.account_id = 1 order by u.id limit 10;
-- SHOW SESSION STATUS LIKE 'Handler%';

Handler_commit  1
Handler_delete  0
Handler_discover    0
Handler_external_lock   4
Handler_mrr_init    0
Handler_prepare 0
Handler_read_first  0
Handler_read_key    11
Handler_read_last   0
Handler_read_next   100110
Handler_read_prev   0
Handler_read_rnd    0
Handler_read_rnd_next   0
Handler_rollback    0
Handler_savepoint   0
Handler_savepoint_rollback  0
Handler_update  0
Handler_write   0
-- FLUSH STATUS;
-- set @uGroups := (select group_concat(id) from user_group where account_id = 1 group by account_id);
-- select * from user where group_id in (select @uGroups) order by id limit 10;
-- SHOW SESSION STATUS LIKE 'Handler%';

Handler_commit  2
Handler_delete  0
Handler_discover    0
Handler_external_lock   4
Handler_mrr_init    0
Handler_prepare 0
Handler_read_first  0
Handler_read_key    2
Handler_read_last   0
Handler_read_next   19
Handler_read_prev   0
Handler_read_rnd    0
Handler_read_rnd_next   0
Handler_rollback    0
Handler_savepoint   0
Handler_savepoint_rollback  0
Handler_update  0
Handler_write   0

Best Answer

Investigate the plan. MySQL alters the order of the tables scanning.

Add STRAIGHT_JOIN. https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=592b1fb77c5c383abb8d93f9d9e13559