Mysql – Why does this Entity Framework query perform so bad in MySQL

entity-frameworkMySQLormperformance

Let me give you a little setup. We have a InnoDB table on MySQL 5.1 with close to 20 million records, no foreign keys, and proper indexes for queries we do. We are using the latest 6.3.5 MySQL release for the .NET Entity Framework. Normally we are use to dealing with SQL Server and the Entity Framework, but on this project we decided to give MySQL a try.

I have some theories on the issue but let me first do the code setup

EF LINQ Query

var failsForAcct1001 = db.Failures.Where(x => x.AccountId == 1001);
/* farther down and later on in the code */
return failsForAcct1001.OrderBy(x => x.FailureId).Take(50);

MySQL Code Generated

Please ignore the column names they aren't nessisary to understand the problem

select
    External1.FailureId,
    External1.col2,
    External1.col3,
from (select
          Inner1.FailureId,
          Inner1.col2,
          Inner1.col3,
      from Failures Inner1
      where External1.AccountId = 1001
    ) External1
order by External1.FailureId
limit 50

This SQL being generated is very similar to what happens in SQL Server, and SQL handles it with out a problem. And this isn't just an Entity Framework issue, when I take this query and execute in MySQL Workbench it also times out. However the following query returns in under a half second.

select
    External1.FailureId,
    External1.col2,
    External1.col3,
from Failures External1
where External1.AccountId = 1001
order by External1.FailureId
limit 50

I assume this performance problem has to do with the fact that MySQL is trying to execute the inner query which pulls back all the records in the database and then tries to sort and take 50, instead of doing this all as one operation like the second query does.

I really just want to confirm my observations, and ask if there is anything I can do to get the first SQL statement to execute faster, without modifying the query in any way.

Best Answer

Step 1 - get an explain plan, in particular EXPLAIN EXTENDED will show you what SQL the query optimizer actually generated. Maybe additional indexes will help - but if you "outsource" your SQL generation then you really are at the mercy of whatever was easier for the ORM developer, not what was best for your application. Another thing you might try is in-memory temp db, as this can help with sorts, and derived tables, which you do have...