MySQL hitting an IOPS wall in AWS, unable to utilize the volume's full provisioned IOPS

amazon-ec2, aws, mysql

We are scaling up an existing MySQL 5.6.41 (with Galera) system running on EC2 in AWS, and are hitting a performance wall that we believe we have narrowed down to storage IOPS.

Please note that, for the sake of testing baseline MySQL performance, the Galera cluster consists of a single node, so no node-to-node replication is occurring.

Setup

The database consists of several tables, one of which has ~10M rows and handles most of the read/write volume. Reads and writes occur across different rows, so there is no row contention (apart from any index range locking that may occur).

The node is running on an AWS r5.2xlarge (8 vCPUs, 64GB RAM) against an EBS volume with provisioned IOPS.

Our load testing consists of two 8-core boxes in the same availability zone and private subnet spinning up 150+ client connections and hammering the server with the same access pattern seen in the live system (read a single row by key, modify it, and write the updated row back).
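
Each test client loops over roughly the following statement pattern (a minimal sketch; main_table, id, and payload are placeholder names, and whether the read uses a locking SELECT depends on the application code):

START TRANSACTION;
SELECT payload FROM main_table WHERE id = ? FOR UPDATE;  -- read a single row by key
-- the application modifies the row in memory
UPDATE main_table SET payload = ? WHERE id = ?;          -- write the updated row back
COMMIT;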

MySQL is configured to allow 500 connections, and the InnoDB buffer pool is 48GB.

Symptoms

Initially the volume was provisioned at 1000 IOPS; during load testing the volume IOPS stats in CloudWatch pegged at that maximum (1000).

When we raised the volume to 2000 IOPS we were again able to peg the volume at its maximum (2000) and saw an equivalent increase in transactions/second.

However, at both 3000 and 5000 provisioned IOPS, load testing through MySQL would not exceed 2000 IOPS, and, as expected, we also saw no increase in transactions/second. Likewise, sysbench against MySQL could not exceed 2000 IOPS.

Please note the VM and volume appear to be configured correctly, as raw I/O testing with fio does push the IOPS all the way to the provisioned limit (3000 and 5000 respectively).
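
For reference, a fio run of the kind used to validate the volume looks roughly like this (a sketch, not our exact command; the file path is a placeholder, and 16k random read/write with O_DIRECT is assumed in order to roughly approximate InnoDB page I/O):

fio --name=iops-test --filename=/data/fio-testfile --size=10G \
    --rw=randrw --rwmixread=50 --bs=16k --direct=1 \
    --ioengine=libaio --iodepth=32 --runtime=120 --time_based \
    --group_reporting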

Also note that MySQL is not hitting the volume's throughput (MB/s) limit either, and VM CPU utilization barely reaches 10%. MySQL largely seems idle during load testing.

Is the problem something other than IOPS?

We don't think so, and here's why: our initial concerns were table contention, the number of connections, the number of requests, and data bandwidth.

So here is what we did to try to confirm or rule out each of these:

  1. Table contention

    a. We duplicated the main read/write table used in the load test and split the test clients so that each read/wrote against one of the two cloned tables – result: no change in either transactions/second or IOPS used

    b. Switched the main read/write table used in the load test to use partitioning (see the sketch after this list) – result: no change in either transactions/second or IOPS used

  2. Number of connections, requests, or data bandwidth

    a. Set the load test to read and then write back the record data without making any changes; while this eliminates the final I/O write, MySQL still services the requests and (we believe) performs the normal fetching / locking associated with them – result: significantly higher transactions/second, but IOPS still capped at 2000
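
For completeness, the partitioning change in 1b was roughly of this form (a sketch with placeholder table/column names, assuming hash partitioning on the lookup key, which must be part of every unique key on the table):

ALTER TABLE main_table
    PARTITION BY HASH(id)
    PARTITIONS 16;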

We tend to believe the no-op update test above (#2) would yield even higher transactions/second were it not for the IOPS, which once again appear to be the limiting factor.

What have we tried to resolve the IOPS limit seen in our MySQL node?

We've tried modifying the following configuration items based on tuning articles targeting high-transaction systems, SSD storage, and general MySQL recommendations.

These were added and tweaked, both individually and together, but we have seen no improvement. In many cases there was no change in performance at all, good or bad:

innodb_buffer_pool_instances=48

innodb_flush_method=O_DIRECT

innodb_io_capacity=3000
innodb_io_capacity_max=5000
innodb_read_io_threads=16
innodb_write_io_threads=16

innodb_log_file_size=6G
innodb_log_files_in_group=2

innodb_checksum_algorithm=crc32
innodb_flush_neighbors=0
innodb_page_size=64k
innodb_flush_log_at_trx_commit=0

query_cache_size=0
query_cache_type=0
thread_cache_size=32
thread_concurrency=20
transaction_isolation=READ-COMMITTED
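
For completeness, whether these settings are actually in effect, and how much I/O InnoDB itself is issuing during a run, can be sampled with standard status queries, for example:

SHOW GLOBAL VARIABLES LIKE 'innodb_io_capacity%';
SHOW GLOBAL STATUS LIKE 'Innodb_data_reads';
SHOW GLOBAL STATUS LIKE 'Innodb_data_writes';
SHOW GLOBAL STATUS LIKE 'Innodb_data_fsyncs';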

Otherwise our base configuration is rather standard:

[mysqld]
innodb_buffer_pool_size=48G
max_connections=500
skip_name_resolve=ON

wsrep_slave_threads=32
wsrep_provider_options="evs.send_window=1024; evs.user_send_window=512"

I’m certain the problem is something we’ve misconfigured.

Any suggestions or thoughts?

Best Answer

There are many reasons that may lead to being unable to use all of the IOPS available. Here's an unordered list of thoughts.

  • How many threads are being invoked in the benchmark?
  • More than about 40 threads may get into a "thundering herd" scenario, where none of the threads are running efficiently because they are stepping on each other. My favorite analogy is when there are too many people in a grocery store and the carts are logjammed.
  • 10M rows? Sounds like you don't need anywhere near the 48GB of RAM for the buffer_pool?
  • What are the settings for double-write, flush_log_at_trx_commit, etc.? These can be changed to get better performance with lower IOPS; see the sketch after this list. (OK, that goes contrary to your benchmarking goal.) These settings can be relaxed if you go to a multi-node Galera.
  • 5.7 and 8.0 have further improvements.
  • Could it be that each connection is doing very little, hence most of the time is spent building and tearing down connections? The things that Wilson asked for will help us analyze that.
  • Does your load testing test your actual queries? Otherwise, benchmarks tend to provide dubious results.
  • I rarely see a MySQL server maxing out the CPU. When I do, it is almost always fixable by adding an index or reformulating a query. That is, don't be concerned about low CPU usage in your test.
  • PARTITIONing is almost never a performance benefit.
  • UPDATEs can be costly. A copy of the row is kept for potential ROLLBACK. If two connections are hitting the same row, one may be delayed or killed ("deadlock"). Be sure your benchmark mimics the real system with respect to the randomness of rows being updated.
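
As a rough illustration of the relaxed-durability settings mentioned above (a sketch only; these trade crash durability for fewer IOPS, so they are sensible mainly once the data is also protected by multiple Galera nodes):

innodb_doublewrite=0                # skip the doublewrite buffer
innodb_flush_log_at_trx_commit=2    # write redo at commit, fsync roughly once per second
sync_binlog=0                       # let the OS decide when to flush the binlog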

Something to try: run your benchmark with a limit of 40 connections. It may be that the "transactions per second" will increase. Note: this metric is probably more important than IOPS utilization.
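
If the benchmark is driven by sysbench, a capped-concurrency run might look roughly like this (a sketch assuming sysbench 1.0's bundled oltp_read_write workload; host, credentials, and table size are placeholders):

sysbench oltp_read_write \
    --mysql-host=10.0.0.10 --mysql-user=bench --mysql-password=bench \
    --mysql-db=bench --tables=1 --table-size=10000000 \
    --threads=40 --time=300 --report-interval=10 \
    run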

Perhaps your testing indicates that the system is bigger than you need?