Mysql – maximum IOPS for MySQL asynchronous replication on the slave server

innodbMySQLoptimizationperformancereplication

I have been struggling with this issue for a while now. I have a simple Master-Slave setup with MySQL 8.0.15 installed on both systems. In day-to-day operations the slave keeps up the Master with no issue, but unfortunately with our network, the connection between the two server will sometimes be down for a couple hours or more. When this happens and the connection finally comes back up, it appears the relay logs are written on the slave at the expected speed (network speed), but the application of these logs into the MySQL database is not reaching the full potential of the physical disks.

The RAM and CPU allotted are not anywhere near capped either. When replication is catching up on the slave, the disks only reach around 600KB/s, but in typical operations (Windows operations) these SSDs can reach > 500MB/s. I know these are more 'random' writes, but I was not expecting this much of a performance drop off. This is actually the same speed as a similar setup we have that use only spinning disks spec'ed at far less IOPS.

The hardware/software is the same on the master and slave: The data partition is setup in a separate RAID-10 of SSDs with the OS and application on a separate RAID 1, 64GB RAM, 88 'CPU's, both are running Windows 2016 with MySQL 8.0.15, and they are using similar configuration files. Here are the startup configs I am currently using:

[client]
port=3306
default-character-set=utf8
[mysql]
no-beep
default-character-set=utf8
[mysqld]
port=3306
datadir="W:/Data"
character-set-server=utf8
default_authentication_plugin=mysql_native_password
default-storage-engine=INNODB
sql-mode="STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION"
log-output=FILE
general-log=0
general_log_file="S:/general.log"
slow-query-log=1
slow_query_log_file="S:/slow.log"
long_query_time=10
log-bin="S:/bin"
log-error="S:/error.log"
server-id=10
lower_case_table_names=1
secure-file-priv=""
loose-local-infile=1
max_connections=500
table_open_cache=2000
tmp_table_size=7G
thread_cache_size=48
myisam_max_sort_file_size=100G
myisam_sort_buffer_size=12G
key_buffer_size=11M
read_buffer_size=64K
read_rnd_buffer_size=256K
innodb_data_home_dir="W:/Data"
innodb_flush_log_at_trx_commit=1
innodb_thread_concurrency=0
innodb_autoextend_increment=64
innodb_buffer_pool_instances=8
innodb_log_buffer_size=120M
innodb_buffer_pool_size=50G
innodb_log_file_size=15G
innodb_concurrency_tickets=5000
innodb_old_blocks_time=1000
innodb_open_files=1000
innodb_stats_on_metadata=0
innodb_file_per_table=1
innodb_checksum_algorithm=0
back_log=80
flush_time=0
join_buffer_size=1M
max_allowed_packet=1G
slave_max_allowed_packet=1G
max_connect_errors=2147483648
open_files_limit=4161
sort_buffer_size=2M
table_definition_cache=1400
binlog_row_event_max_size=8K
sync_master_info=10000
sync_relay_log=10000
sync_relay_log_info=10000
loose_mysqlx_port=33060

REPLICATION

replicate-same-server-id=0
sync_binlog=1
gtid-mode=ON
enforce-gtid-consistency=true

Slave optimizations

slave-parallel-type=LOGICAL_CLOCK
slave-parallel-workers=88
slave-compressed-protocol=ON
log-slave-updates=0
binlog-group-commit-sync-delay=4000
binlog-group-commit-sync-no-delay-count=5
binlog-row-image=MINIMAL
binlog-format=STATEMENT

All of our databases use solely innodb tables. I have attempted various optimizations from my research online. I have tried adjusting slave-parallel settings and various innodb settings, none of which seem to have an affect on disk throughput when the slave is catching up. Does anyone see anything here I need to adjust or any recommendations on how to go about troubleshooting this? Or even if you think I should be looking at hardware vs. software? Thank you!

Please let me know if you need any more information.

Best Answer

The immediate fix for the Slave

SET GLOBAL innodb_flush_log_at_trx_commit = 2;

This should speed up the slave a little.

Go to my.cnf on the Slave and add

[mysqld]
innodb_flush_log_at_trx_commit = 2
innodb_flush_method=O_DIRECT

and restart mysqld.

Please read my old post Dynamic change to innodb_flush_log_at_trx_commit for more clarification.