MySQL RDS Instance eating up memory and swapping

amazon-rds, innodb, memory, MySQL, mysql-5.5

I have a memory issue with an RDS instance of the db.m2.4xlarge class.

Below are the specifications of the RDS instance:

Instance Class: db.m2.4xlarge ( 68.4 GB of RAM )

Engine: mysql (5.5.37)

IOPS: 8000

Storage: 805GB ( 128 GB of data )

Multi AZ: Yes

Automated Backups: Enabled (5 Days)

The MySQL instance is tuned without changing the InnoDB buffer values.

Total buffers : 23.9G global + 3.0M per thread (1000 max threads)

Maximum possible memory usage: 26.8G (42% of installed RAM)

Data in InnoDB tables: 98G
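
For reference, a rough way to confirm the effective settings behind those numbers (these are standard MySQL variable names; the actual values come from the RDS parameter group):

SHOW GLOBAL VARIABLES WHERE Variable_name IN
    ('innodb_buffer_pool_size', 'key_buffer_size', 'max_connections',
     'sort_buffer_size', 'join_buffer_size', 'read_buffer_size',
     'read_rnd_buffer_size', 'tmp_table_size', 'max_heap_table_size');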

Earlier we used a higher, more optimized buffer value, but the frequency of reboots made us reduce it back to the default buffer size set by RDS.

Our application runs on 4 ColdFusion servers and 4 ASP.NET servers, all of which communicate with this MySQL RDS instance. We use connection pooling for the applications. All DML statements are done through stored procedures, which are executed very frequently.

My issue is that the RDS instance, if left alone, slowly eats up the memory and starts to swap. When it swaps, our applications start to fail one by one, and we are left with no option but to reboot the instance. Currently we reboot the server whenever free memory goes below 5 GB.

Another approach we take is to restart the ColdFusion servers, restart IIS, and then issue a FLUSH TABLES query. Sometimes this yields results, and sometimes there is no change. Theoretically, I believe the memory should be released even without the FLUSH TABLES query, but that doesn't happen. When memory is released, it is usually in the range of 1–7 GB.
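
For completeness, the MySQL part of that cleanup is roughly the sketch below (the ColdFusion/IIS restarts happen on the application servers; checking Open_tables before and after is just a way to see whether the flush actually closed anything):

SHOW GLOBAL STATUS LIKE 'Open_tables';
FLUSH TABLES;
SHOW GLOBAL STATUS LIKE 'Open_tables';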

The interactive timeout set for MySQL is 300 seconds, the ColdFusion connection lifetime is 300 seconds, and the timeout set in ASP.NET is 270 seconds.
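
The MySQL-side timeouts can be confirmed with the following (wait_timeout is the value that applies to the pooled, non-interactive application connections):

SHOW GLOBAL VARIABLES WHERE Variable_name IN ('interactive_timeout', 'wait_timeout');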

I am unable to track down what is using the memory and holding on to it. Ideally the memory should be freed when MySQL needs it, but that is not happening.

I need some guidance on how to track down the memory hog, so that the RDS instance can keep functioning without needing a reboot.

We already have MONyog monitoring the RDS instance. We have also enabled Global Status History (GoSH) in RDS, but it hasn't been very useful for me.
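
As a simple alternative that does not depend on GoSH, the status counters can be sampled directly and diffed over time; a sketch of the kind of query I mean:

SELECT VARIABLE_NAME, VARIABLE_VALUE
FROM INFORMATION_SCHEMA.GLOBAL_STATUS
WHERE VARIABLE_NAME IN ('Created_tmp_tables', 'Created_tmp_disk_tables',
                        'Open_tables', 'Threads_connected', 'Threads_cached');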

I have already seen a similar issue mentioned here: When should I think about upgrading our RDS MySQL instance based on memory usage?
We have already done everything recommended there as part of DB tuning.

Below are the graphs.

The last 3 weeks:

[Graph: RDS freeable memory over the last 3 weeks]

The graph starts with a reboot, which happened after the memory went down to 0 and the instance started swapping. After that, there have been 2 reboots.

The current week:

[Graph: RDS freeable memory for the current week]

There were 2 memory releases on August 19th, 2014. The one that freed 6 GB came from restarting services and flushing tables, as explained above.

The past 24 hours:

[Graph: RDS freeable memory over the past 24 hours]

EDIT

Adding further details as needed.

The processlist is as below:

mysql> SHOW PROCESSLIST;
+--------+-----------------+--------------------------------------+-------+---------+-------+-----------------------------+------------------+
| Id     | User            | Host                                 | db    | Command | Time  | State                       | Info             |
+--------+-----------------+--------------------------------------+-------+---------+-------+-----------------------------+------------------+
| 1      | event_scheduler | localhost                            | NULL  | Daemon  | 30    | Waiting for next activation | NULL             |
| 332    | user            | ip3:32355                            | NULL  | Sleep   | 1     |                             | NULL             |
| 836    | rdsadmin        | localhost:40463                      | mysql | Sleep   | 6     |                             | NULL             |
| 345919 | user            | server2:49173                        | NULL  | Sleep   | 0     |                             | NULL             |
| 354132 | user            | server2:49641                        | NULL  | Sleep   | 82    |                             | NULL             |
| 386366 | user            | ip1.us-west-2.compute.internal:64097 | db    | Sleep   | 1     |                             | NULL             |
| 389625 | user            | ip2.us-west-2.compute.internal:59819 | db    | Sleep   | 0     |                             | NULL             |
| 390109 | user            | ip1.us-west-2.compute.internal:64879 | db    | Sleep   | 1     |                             | NULL             |
| 391366 | user            | ip2.us-west-2.compute.internal:60319 | db    | Sleep   | 1     |                             | NULL             |
| 392045 | user            | ip3:39593                            | NULL  | Sleep   | 20193 |                             | NULL             |
| 393625 | user            | ip3:26708                            | db    | Sleep   | 3902  |                             | NULL             |
| 393626 | user            | ip3:37544                            | NULL  | Sleep   | 5677  |                             | NULL             |
| 393933 | user            | ip2.us-west-2.compute.internal:60912 | db    | Sleep   | 0     |                             | NULL             |
| 394462 | user            | ip1.us-west-2.compute.internal:49971 | db    | Sleep   | 3     |                             | NULL             |
| 394802 | user            | ip7:1142                             | db    | Sleep   | 88    |                             | NULL             |
| 395410 | user            | ip2.us-west-2.compute.internal:64725 | db    | Sleep   | 4     |                             | NULL             |
| 396217 | user            | ip2.us-west-2.compute.internal:64891 | db    | Sleep   | 12    |                             | NULL             |
| 396581 | user            | ip7:1423                             | db    | Sleep   | 22    |                             | NULL             |
| 396731 | user            | ip7:1429                             | db    | Sleep   | 21    |                             | NULL             |
| 396954 | user            | ip1.us-west-2.compute.internal:50472 | db    | Sleep   | 7     |                             | NULL             |
| 398509 | user            | ip1.us-west-2.compute.internal:50595 | db    | Sleep   | 179   |                             | NULL             |
| 399337 | user            | ip3.us-west-2.compute.internal:49539 | db    | Sleep   | 219   |                             | NULL             |
| 399338 | user            | ip3.us-west-2.compute.internal:49540 | db    | Sleep   | 219   |                             | NULL             |
| 399360 | user            | ip4.us-west-2.compute.internal:62560 | db    | Sleep   | 0     |                             | NULL             |
| 399363 | user            | ip4.us-west-2.compute.internal:62568 | db    | Sleep   | 0     |                             | NULL             |
| 399580 | user            | ip5.us-west-2.compute.internal:50055 | db    | Sleep   | 1     |                             | NULL             |
| 399581 | user            | ip6.us-west-2.compute.internal:58023 | db    | Sleep   | 1     |                             | NULL             |
| 399607 | user            | ip6.us-west-2.compute.internal:58034 | db    | Sleep   | 1     |                             | NULL             |
| 399608 | user            | ip6.us-west-2.compute.internal:58036 | db    | Sleep   | 1     |                             | NULL             |
| 399609 | user            | ip6.us-west-2.compute.internal:58037 | db    | Sleep   | 1     |                             | NULL             |
| 399669 | user            | ip4.us-west-2.compute.internal:62615 | db    | Sleep   | 0     |                             | NULL             |
| 399682 | user            | ip3:9507                             | NULL  | Query   | 0     | NULL                        | SHOW PROCESSLIST |
| 399696 | user            | server1:53241                        | db    | Sleep   | 1     |                             | NULL             |
| 399704 | user            | server1:53242                        | db    | Sleep   | 0     |                             | NULL             |
| 399705 | user            | ip4.us-west-2.compute.internal:62618 | db    | Sleep   | 0     |                             | NULL             |
| 399706 | user            | ip4.us-west-2.compute.internal:62619 | db    | Sleep   | 0     |                             | NULL             |
| 399707 | user            | ip4.us-west-2.compute.internal:62620 | db    | Sleep   | 0     |                             | NULL             |
+--------+-----------------+--------------------------------------+-------+---------+-------+-----------------------------+------------------+
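
Since almost every thread above is just sleeping, a grouped view of the same data is easier to read; a sketch using INFORMATION_SCHEMA (the column names are standard in 5.5):

SELECT USER, SUBSTRING_INDEX(HOST, ':', 1) AS host, COMMAND, COUNT(*) AS threads
FROM INFORMATION_SCHEMA.PROCESSLIST
GROUP BY USER, SUBSTRING_INDEX(HOST, ':', 1), COMMAND
ORDER BY threads DESC;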

SHOW ENGINE INNODB STATUS

I was not able to format the data properly, so I have put it on Pastebin:
http://pastebin.com/JETQ6KG3

Thank you.

The server was again rebooted in the morning to free up memory.

Best Answer

Memory is not instrumented in MySQL until version 5.7 (currently in development), so this does make your question a bit of a guessing game.

I can see from the InnoDB status output that it doesn't appear to be InnoDB consuming the memory (assuming you collected this while the problem was occurring):

Total memory allocated 26217103360; in additional pool allocated 0
Dictionary memory allocated 1212141
Buffer pool size   1563519

Total memory: 24.41 GiB (InnoDB needs about 5-10% more than your buffer pool for internal structures)
Buffer pool memory: ~23.85 GiB
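
Those figures come straight from the status output above; a quick sanity check of the conversion, assuming the default 16 KiB InnoDB page size:

SELECT 26217103360 / POW(1024, 3)     AS total_allocated_gib,  -- roughly 24.4
       1563519 * 16384 / POW(1024, 3) AS buffer_pool_gib;      -- roughly 23.9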

I'm going to have to guess that the memory is being consumed at the MySQL layer (not InnoDB). Two common causes are internal temporary tables and very large sort/join buffers.
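
A rough way to test that guess with what 5.5 already exposes (no 5.7 memory instrumentation required) is to watch how quickly the temporary-table counters grow and how large the per-thread buffers are, since every one of your 1000 allowed connections can allocate its own sort/join buffers:

SHOW GLOBAL STATUS LIKE 'Created_tmp%';
SHOW GLOBAL VARIABLES WHERE Variable_name IN
    ('sort_buffer_size', 'join_buffer_size', 'tmp_table_size',
     'max_heap_table_size', 'max_connections');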