I'm trying to run some reporting queries on a MySQL database, and I'm running into serious performance issues for some reason. One table, page_views
, is extremely large (~70 million records), so some slowness is to be expected, but it's got lots of indexing, and all the queries are restricted to a certain date range on the started_at
field. I'm seeing simple joins taking upwards of ten minutes. The query I've been tossing around lately (which I've stripped way down) looks like this:
select count(page_views.id) as views
from playpack_media
join page_views on page_views.itemId=playpack_media.media_Id
where playpack_media.playpack_id = 1
and page_views.started_at BETWEEN '2014-06-23' and '2014-07-07'
It's taking about 11 minutes 30 seconds to run. DESCRIBE
returns this analysis:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+------+----------------------+-------------+---------+-------------------------------------+------+-------------+
| 1 | SIMPLE | playpack_media | ref | media_id,playpack_id | playpack_id | 4 | const | 594 | |
| 1 | SIMPLE | page_views | ref | itemId,started_at | itemId | 4 | db_playsmrt.playpack_media.media_id | 4236 | Using where |
It seems to bottleneck at the "copying to tmp table" stage. I've tried increasing tmp_table_size and max_heap_table_size to 256M and key_buffer_size to 64M – the latter change dropped the time from ~18 minutes to ~11, which is a big improvement but still terrible, and a further increase to 128M didn't do much.
One thing I noticed while hammmering at this is that page_views
uses MyISAM and playpack_media
uses InnoDB, which I'm reading can be an issue. However, changing page_views
to InnoDB on my (much smaller) development server makes it dramatically slower. It's possible that it becomes worthwhile on larger tables, but just making the experiment would take a long time.
Does anyone have an idea what's going on here? What else should I be looking at?
Best Answer
Look at the original query
You should do a couple of things.
First, create to compound indexes
Try to refactor the query
Perhaps breaking down the views by ItemID