Mysql – MyISAM Performance: Join Decomposition

myisamMySQLperformance

in High Performance MySQL on page 159 they talk about breaking up complex queries into simple ones:

Converting

SELECT * FROM tag
JOIN tag_post ON tag_post.tag_id=tag.id
JOIN post ON tag_post.post_id=post.id
WHERE tag.tag='mysql';

To

SELECT * FROM tag WHERE tag='mysql';
SELECT * FROM tag_post WHERE tag_id=1234;
SELECT * FROM post WHERE post.id in (123,456,567,9098,8904);

And sort of doing the actual join yourself in your application.

My Question is whether this is stil such a good idea when the final query has a where-clause with a few thousand IDs it needs to match (the actual table itself has about 500k entries).

What I mean is, will there be a big penalty for having a query like

SELECT * FROM post WHERE post.id in (123,456,567, ... <a few thousand IDs here> ... ,9098,8904);

instead of the join-statement above? Would it help to move this logic to Stored Procedures inside the Database (while considering how poorly stored procedures are implemented in MySQL)?

Best Answer

I have done this in a few places. Doing multiple simple queries and building an ID list in the application logic, even with the ID list containing 10,000+ ID's made significant performance increases. The table I was querying had around 5 million records and doing a JOIN was painfully slow. After switching to using IN with an ID list it took about 1% of the time the JOIN took.