Your first query is asking for 20,000 rows. what is being victimized here ???
Your Innodb Buffer Pool and Log Files.
Because of SELECT * FROM history WHERE log LIKE '%a%' ORDER BY log ASC LIMIT 0, 20000;
, all the data pages for the history table is hogging InnoDB resources.
Since you are using Amazon RDS, you cannot resize the InnoDB Log Files. They are always 128M under all RDS MySQL models. Your buffer pool may or may not be full (You have 5882511360 set for it, which is 5610M or 5.4785G). Naturally, bigger model means more IOPs.
You can run this to see how full the buffer pool is
SELECT FORMAT(A.num * 100.0 / B.num,2) BufferPoolFullPct FROM
(SELECT variable_value num FROM information_schema.global_status
WHERE variable_name = 'Innodb_buffer_pool_pages_data') A,
(SELECT variable_value num FROM information_schema.global_status
WHERE variable_name = 'Innodb_buffer_pool_pages_total') B;
Another aspect I see is the log
field (TEXT). You are asking for an ORDER BY
on this big column.
You need to change the first query to throttle the SELECT. In other words, fetch smaller chunks of history.
If you cannot change the data or the query, there is nothing you can configure except to shift to m1.xlarge (Comes with 11922309120 as the Buffer Pool Size = 11370M = 11.1035G) although log file size would still be stuck at 128M.
Infrastructure aside, it is possible for SELECTs to block INSERTs, UPDATEs, and DELETEs in InnoDB.
I wrote about this before
So the three questions I'm hoping someone can answer are:
1.Should I be using a CURSOR here or is there an alternative that will get the same recursive result?
No, you should not. The CURSOR keep being opened over and over. Thinking of the overhead make me cringe. Personally, I stay away from CURSORs.
2.How can I get the results back in a single result set so that it can be used in the same fashion as a subselect?
3.What is the proper way of using the results from the CALL because I can't at least as far as I've tried get the sample SELECT statement above to work? I believe this is because I can't use CALL inline but I'm not sure.
I would like to bail you out of this by suggesting something I learned from my college days when learning about data structures. You need to perform tree traversal using a queue. This allows you to start at a root and express all descendants of a tree.
The algorithm goes like this
- STEP 01 : Start with an empty queue
- STEP 02 : Dequeue Node From the Front of the Queue
- STEP 03 : Enqueue All Children of the Latest Node
- STEP 04 : Process Info From the Latest Node
- STEP 05 : If the Queue is Not Empty, Go Back to STEP 02
- STEP 06 : All Done
This allows you to traverse a recursive structure without using Programmatic Recursion. At this point, you are probably asking: How can I traverse a tree structure without recursion?
I wrote a post about how to script three Stored Procedures that can using a loop in a single CALL that will traverse a table with nodes and its parent in a table:
- GetParentIDByID
- GetAncestry
- GetFamilyTree
The post is Find highest level of a hierarchical field: with vs without CTEs (Oct 24, 2011). It contains the Stored Procedures already written that will traverse the following table structure:
+------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| parent_id | int(11) | YES | | NULL | |
| name | varchar(255) | YES | | NULL | |
| notes | text | YES | | NULL | |
+------------+--------------+------+-----+---------+----------------+
Please read the code carefully and apply the principles.
Give it a Try !!!
Best Answer
DISADVANTAGE #1
InnoDB only has one Buffer Pool. The benefits of partitioning or compartmentalizing database tables for different clients can still be bottlenecked by database buffers (log buffer, InnoDB Buffer Pool)
DISADVANTAGE #2
Scaling becomes dependent on manual operation. This applies to scaling up disks, hardware (disk controllers), and memory.
DISADVANTAGE #3
More tables means more open file handles to cache distinct tables. Requires more flush to disk.
DISADVANTAGE #4
Introducing new groups of related tables regurgitates the first three(3) disadvantages and exacerbates the overall design.
See my other posts on such setups and other drawbacks
Nov 15, 2011
: One Big Database vs. Several Smaller OnesMar 11, 2011
: What problems will I get creating a database per customer?SUGGESTIONS
You might have to compensate these deficiencies with lots of RAM with the InnoDB Buffer Pool and the Log Buffer getting the lion's share.
You could create a hybrid setup where InnoDB Log Files are on one disk and InnoDB System Tablespace is on another, and all the data on a third disk. This concept is well expressed in YOSHINORI MATSUNOBU'S BLOG (FaceBook DB Engineer).