MySQL hitting an IOPS wall in AWS, unable to utilize the volume's full provisioned IOPS

amazon-ec2, aws, mysql

We are scaling up an existing MySQL 5.6.41 (with Galera) system running on EC2 in AWS, and are hitting a performance wall that we believe we have narrowed down to storage IOPS.

Please note that, for the sake of testing baseline MySQL performance, the Galera cluster consists of a single node, so no node-to-node replication is occurring.

Setup

The database consists of several tables, one of which has ~10M rows and handles most of the read/write volume. Reads and writes occur across different rows, so there is no row contention (apart from any index range locking that may occur).

The node is running on an AWS r5.2xlarge (8 vCPUs, 64GB RAM) against an EBS volume with provisioned IOPS.

Our load testing consists of two 8-core boxes in the same availability zone and private subnet spinning up 150+ client connections and hammering the server with the same access pattern seen in the live system (read a single row by key, modify it, and write the updated row back).
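
Each test client loops over roughly the following statement pattern (a minimal sketch; main_table, id, and payload are placeholder names, and whether the read uses a locking SELECT depends on the application code):

START TRANSACTION;
SELECT payload FROM main_table WHERE id = ? FOR UPDATE;  -- read a single row by key
-- the application modifies the row in memory
UPDATE main_table SET payload = ? WHERE id = ?;          -- write the updated row back
COMMIT;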

MySQL is configured to allow 500 connections, and the InnoDB buffer pool is 48GB.

Symptoms

Initially the volume was provisioned at 1000 IOPS; during load testing the volume IOPS stats in CloudWatch pegged at that maximum (1000).

When we raised the volume to 2000 IOPS we were again able to peg the volume at its maximum (2000) and saw an equivalent increase in transactions/second.

However, at both 3000 and 5000 provisioned IOPS, load testing through MySQL would not exceed 2000 IOPS, and, as expected, we also saw no increase in transactions/second. Likewise, sysbench against MySQL could not exceed 2000 IOPS.

Please note the VM and volume appear to be configured correctly, as raw I/O testing with fio does push the IOPS all the way to the provisioned limit (3000 and 5000 respectively).
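
For reference, a fio run of the kind used to validate the volume looks roughly like this (a sketch, not our exact command; the file path is a placeholder, and 16k random read/write with O_DIRECT is assumed in order to roughly approximate InnoDB page I/O):

fio --name=iops-test --filename=/data/fio-testfile --size=10G \
    --rw=randrw --rwmixread=50 --bs=16k --direct=1 \
    --ioengine=libaio --iodepth=32 --runtime=120 --time_based \
    --group_reporting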

Also note that MySQL is not hitting the volume's throughput (MB/s) limit either, and VM CPU utilization barely reaches 10%. MySQL largely seems idle during load testing.

Is the problem something other than IOPS?

We don't think so, and here's why: our initial concerns were table contention, the number of connections, the number of requests, and data bandwidth.

So here is what we did to try to confirm or rule out each of these:

  1. Table contention

    a. We duplicated the main read/write table used in the load test and split the test clients so that each read/wrote against one of the two cloned tables – result: no change in either transactions/second or IOPS used

    b. Switched the main read/write table used in the load test to use partitioning (see the sketch after this list) – result: no change in either transactions/second or IOPS used

  2. Number of connections, requests, or data bandwidth

    a. Set the load test to read and then write back the record data without making any changes; while this eliminates the final I/O write, MySQL still services the requests and (we believe) performs the normal fetching / locking associated with them – result: significantly higher transactions/second, but IOPS still capped at 2000
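
For completeness, the partitioning change in 1b was roughly of this form (a sketch with placeholder table/column names, assuming hash partitioning on the lookup key, which must be part of every unique key on the table):

ALTER TABLE main_table
    PARTITION BY HASH(id)
    PARTITIONS 16;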

We tend to believe the no-op update test above (#2) would yield even higher transactions/second were it not for the IOPS, which once again appear to be the limiting factor.

What have we tried to resolve the IOPS limit seen in our MySQL node?

We've tried modifying the following configuration items based on tuning articles targeting high-transaction systems, SSD storage, and general MySQL recommendations.

These were added and tweaked, both individually and together, but we have seen no improvement. In many cases there was no change in performance at all, good or bad:

innodb_buffer_pool_instances=48

innodb_flush_method=O_DIRECT

innodb_io_capacity=3000
innodb_io_capacity_max=5000
innodb_read_io_threads=16
innodb_write_io_threads=16

innodb_log_file_size=6G
innodb_log_files_in_group=2

innodb_checksum_algorithm=crc32
innodb_flush_neighbors=0
innodb_page_size=64k
innodb_flush_log_at_trx_commit=0

query_cache_size=0
query_cache_type=0
thread_cache_size=32
thread_concurrency=20
transaction_isolation=READ-COMMITTED
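
For completeness, whether these settings are actually in effect, and how much I/O InnoDB itself is issuing during a run, can be sampled with standard status queries, for example:

SHOW GLOBAL VARIABLES LIKE 'innodb_io_capacity%';
SHOW GLOBAL STATUS LIKE 'Innodb_data_reads';
SHOW GLOBAL STATUS LIKE 'Innodb_data_writes';
SHOW GLOBAL STATUS LIKE 'Innodb_data_fsyncs';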

Otherwise our base configuration is rather standard:

[mysqld]
innodb_buffer_pool_size=48G
max_connections=500
skip_name_resolve=ON

wsrep_slave_threads=32
wsrep_provider_options="evs.send_window=1024; evs.user_send_window=512"

I’m certain the problem is something we’ve misconfigured.

Any suggestions or thoughts?

Best Answer

There are many reasons that may lead to being unable to use all of the IOPS available. Here's an unordered list of thoughts.

  • How many threads are being invoked in the benchmark?
  • More than about 40 threads may get into a "thundering herd" scenario, where none of the threads are running efficiently because they are stepping on each other. My favorite analogy is when there are too many people in a grocery store and the carts are logjammed.
  • 10M rows? Sounds like you don't need anywhere near the 48GB of RAM for the buffer_pool?
  • What are the settings for double-write, flush_log_at_trx_commit, etc.? These can be changed to get better performance with lower IOPS; see the sketch after this list. (OK, that goes contrary to your benchmarking goal.) These settings can be relaxed if you go to a multi-node Galera.
  • 5.7 and 8.0 have further improvements.
  • Could it be that each connection is doing very little, hence most of the time is spent building and tearing down connections? The things that Wilson asked for will help us analyze that.
  • Does your load testing test your actual queries? Otherwise, benchmarks tend to provide dubious results.
  • I rarely see a MySQL server maxing out the CPU. When I do, it is almost always fixable by adding an index or reformulating a query. That is, don't be concerned about low CPU usage in your test.
  • PARTITIONing is almost never a performance benefit.
  • UPDATEs can be costly. A copy of the row is kept for potential ROLLBACK. If two connections are hitting the same row, one may be delayed or killed ("deadlock"). Be sure your benchmark mimics the real system with respect to the randomness of rows being updated.
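
As a rough illustration of the relaxed-durability settings mentioned above (a sketch only; these trade crash durability for fewer IOPS, so they are sensible mainly once the data is also protected by multiple Galera nodes):

innodb_doublewrite=0                # skip the doublewrite buffer
innodb_flush_log_at_trx_commit=2    # write redo at commit, fsync roughly once per second
sync_binlog=0                       # let the OS decide when to flush the binlog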

Something to try: run your benchmark with a limit of 40 connections. It may be that the "transactions per second" will increase. Note: this metric is probably more important than IOPS utilization.
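
If the benchmark is driven by sysbench, a capped-concurrency run might look roughly like this (a sketch assuming sysbench 1.0's bundled oltp_read_write workload; host, credentials, and table size are placeholders):

sysbench oltp_read_write \
    --mysql-host=10.0.0.10 --mysql-user=bench --mysql-password=bench \
    --mysql-db=bench --tables=1 --table-size=10000000 \
    --threads=40 --time=300 --report-interval=10 \
    run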

Perhaps your testing indicates that the system is bigger than you need?