Managed to solve this, these are the steps I followed:
Firstly, I contacted the Amazon RDS team by posting on their discussion forum, they confirmed it was the mysqld process taking up all this CPU - this eliminated a configuration fault with something else running on the physical server
Secondly I tracked down the source of the queries that were running:
SELECT `mytable`.* FROM `mytable` WHERE `mytable`.`foreign_key` = 231273 LIMIT 1
I originally overlooked this as the cause, because none of these queries seemed to be taking particularly long when I monitored the show processlist output. After exhausting other avenues, I decided it might be worth following up....and I'm glad I did.
As you can see in the show processlist output, these queries were coming from a utlility server, which runs some tactical utility jobs that exist outside of our main application code. This is why they were not showing up as slow or causing issues in our new relic monitoring, because the new relic agent is only installed on our main app server.
Loosely following this guide:
http://www.mysqlperformanceblog.com/2007/02/08/debugging-sleeping-connections-with-mysql/
I was able to trace these queries to a specific running process on our utility server box. This was a bit of ruby code that was very inefficiently iterating through around 70,000 records, checking some field values and using those to decide whether it needs to create a new record in 'mytable.' After doing some analysis I was able to determine, the process was no longer needed so could be killed.
Something that was making matters worse, there seemed to be 6 instances of this same process running at one time due to the way the cron job was configured and how long each one took! I killed off these processes, and incredibly our CPU usage fell from around 100% to around 5%!
Basically, you want to result the running total and group it
This requires iterative variable manipulation.
First, your sample data quadrupled:
mysql> drop database if exists all_db;
Query OK, 1 row affected (0.03 sec)
mysql> create database all_db;
Query OK, 1 row affected (0.00 sec)
mysql> use all_db
Database changed
mysql> create table table1
-> (
-> object varchar(255),
-> frequency tinyint unsigned
-> );
Query OK, 0 rows affected (0.06 sec)
mysql> insert into table1 values
-> ('object1',7),('object2',5),('object3',6),('object4',8),('object5',2);
Query OK, 5 rows affected (0.05 sec)
Records: 5 Duplicates: 0 Warnings: 0
mysql> insert into table1 select * from table1;
Query OK, 5 rows affected (0.06 sec)
Records: 5 Duplicates: 0 Warnings: 0
mysql> insert into table1 select * from table1;
Query OK, 10 rows affected (0.06 sec)
Records: 10 Duplicates: 0 Warnings: 0
mysql> select * from table1;
+---------+-----------+
| object | frequency |
+---------+-----------+
| object1 | 7 |
| object2 | 5 |
| object3 | 6 |
| object4 | 8 |
| object5 | 2 |
| object1 | 7 |
| object2 | 5 |
| object3 | 6 |
| object4 | 8 |
| object5 | 2 |
| object1 | 7 |
| object2 | 5 |
| object3 | 6 |
| object4 | 8 |
| object5 | 2 |
| object1 | 7 |
| object2 | 5 |
| object3 | 6 |
| object4 | 8 |
| object5 | 2 |
+---------+-----------+
20 rows in set (0.00 sec)
mysql>
Let's make the query that generates group values:
set @freq_limit = 20;
set @freq_count = 0;
set @freq_group = 1;
set @freq_sum = 0;
select cc,GROUP_CONCAT(object) objects from
(
select * from
(
select *,
@freq_groupinc := IF(@freq_count >= @freq_limit,1,0) bb,
@freq_count := IF(@freq_count >= @freq_limit,0,@freq_count) aa,
@freq_group := @freq_group + @freq_groupinc cc,
@freq_count := @freq_count + frequency dd
FROM table1
) AA) A GROUP BY cc;
Would you believe it works?
First I'll let you see the inner subquery's output
mysql> select * from
-> (
-> select *,
-> @freq_groupinc := IF(@freq_count >= @freq_limit,1,0) bb,
-> @freq_count := IF(@freq_count >= @freq_limit,0,@freq_count) aa,
-> @freq_group := @freq_group + @freq_groupinc cc,
-> @freq_count := @freq_count + frequency dd
-> FROM table1
-> ) AA;
+---------+-----------+----+------+------+------+
| object | frequency | bb | aa | cc | dd |
+---------+-----------+----+------+------+------+
| object1 | 7 | 0 | 0 | 1 | 7 |
| object2 | 5 | 0 | 7 | 1 | 12 |
| object3 | 6 | 0 | 12 | 1 | 18 |
| object4 | 8 | 0 | 18 | 1 | 26 |
| object5 | 2 | 1 | 0 | 2 | 2 |
| object1 | 7 | 0 | 2 | 2 | 9 |
| object2 | 5 | 0 | 9 | 2 | 14 |
| object3 | 6 | 0 | 14 | 2 | 20 |
| object4 | 8 | 1 | 0 | 3 | 8 |
| object5 | 2 | 0 | 8 | 3 | 10 |
| object1 | 7 | 0 | 10 | 3 | 17 |
| object2 | 5 | 0 | 17 | 3 | 22 |
| object3 | 6 | 1 | 0 | 4 | 6 |
| object4 | 8 | 0 | 6 | 4 | 14 |
| object5 | 2 | 0 | 14 | 4 | 16 |
| object1 | 7 | 0 | 16 | 4 | 23 |
| object2 | 5 | 1 | 0 | 5 | 5 |
| object3 | 6 | 0 | 5 | 5 | 11 |
| object4 | 8 | 0 | 11 | 5 | 19 |
| object5 | 2 | 0 | 19 | 5 | 21 |
+---------+-----------+----+------+------+------+
20 rows in set (0.00 sec)
mysql>
Look at the cc
column. It increments every time the sum exceeds @freq_limit (20) and resets.
Now just run GROUP_CONCAT on object, grouping by cc
mysql> set @freq_limit = 20;
Query OK, 0 rows affected (0.00 sec)
mysql> set @freq_count = 0;
Query OK, 0 rows affected (0.00 sec)
mysql> set @freq_group = 1;
Query OK, 0 rows affected (0.00 sec)
mysql> set @freq_sum = 0;
Query OK, 0 rows affected (0.00 sec)
mysql> select cc,GROUP_CONCAT(object) objects from
-> (
-> select * from
-> (
-> select *,
-> @freq_groupinc := IF(@freq_count >= @freq_limit,1,0) bb,
-> @freq_count := IF(@freq_count >= @freq_limit,0,@freq_count) aa,
-> @freq_group := @freq_group + @freq_groupinc cc,
-> @freq_count := @freq_count + frequency dd
-> FROM table1
-> ) AA) A GROUP BY cc;
+------+---------------------------------+
| cc | objects |
+------+---------------------------------+
| 1 | object1,object2,object3,object4 |
| 2 | object2,object3,object5,object1 |
| 3 | object4,object5,object1,object2 |
| 4 | object5,object1,object3,object4 |
| 5 | object2,object3,object4,object5 |
+------+---------------------------------+
5 rows in set (0.00 sec)
mysql>
Just set @freq_limit to whatever value you need.
To show it works for just the 5 rows you supplied:
mysql> drop database if exists all_db;
Query OK, 1 row affected (0.04 sec)
mysql> create database all_db;
Query OK, 1 row affected (0.00 sec)
mysql> use all_db
Database changed
mysql> create table table1
-> (
-> object varchar(255),
-> frequency tinyint unsigned
-> );
Query OK, 0 rows affected (0.05 sec)
mysql> insert into table1 values
-> ('object1',7),('object2',5),('object3',6),('object4',8),('object5',2);
Query OK, 5 rows affected (0.06 sec)
Records: 5 Duplicates: 0 Warnings: 0
mysql> set @freq_limit = 20;
Query OK, 0 rows affected (0.00 sec)
mysql> set @freq_count = 0;
Query OK, 0 rows affected (0.00 sec)
mysql> set @freq_group = 1;
Query OK, 0 rows affected (0.00 sec)
mysql> set @freq_sum = 0;
Query OK, 0 rows affected (0.00 sec)
mysql> select cc,GROUP_CONCAT(object) objects from
-> (
-> select * from
-> (
-> select *,
-> @freq_groupinc := IF(@freq_count >= @freq_limit,1,0) bb,
-> @freq_count := IF(@freq_count >= @freq_limit,0,@freq_count) aa,
-> @freq_group := @freq_group + @freq_groupinc cc,
-> @freq_count := @freq_count + frequency dd
-> FROM table1
-> ) AA) A GROUP BY cc;
+------+---------------------------------+
| cc | objects |
+------+---------------------------------+
| 1 | object1,object2,object3,object4 |
| 2 | object5 |
+------+---------------------------------+
2 rows in set (0.00 sec)
mysql>
Give it a Try !!!
Best Answer
One option might be using user-variables:
Here's an sql fiddle of it, though your example id=4 does not match. In the first block, it's 44, in the expected output block it is 41, so the folders are off.
Adding a caveat here. According to the documentation on user variables:
Just keep that in mind.
Edit by RolandoMySQLDBA (2012-12-17 14:30 EDT)
So as to be sure of what the user variables look like, perhaps have each of the user variables printed along the way : ( See Example )
DTest gets a +1 for his application of user variables.