MySQL – when switching to a bette server lags

MySQLmysql-5.7

I have 2 servers, Ubuntu 16.04 LTS, MySQL 5.7.21-0ubuntu0.16.04.1 – Freshly updated just in case.
64GB
SSD drives
This are not virtual servers, thous are bare metal servers.

Originally I had only 1 server that was running NGINX/PHP/MySQL but the website is growing and the IO demands are now high, so I added another server and wanted to more the MySQL to it.

I have setup only MySQL and moved the database, and copy/past the config file I had. But on the second server the database does not keep up. Really fast the requests are backlogged into "sending data" and Sleep and the server gets to "Max Connections" while the local server holding it OK.

Here is how the MySQL looks like when it's loaded.
MySQL Screen

My MySQL config is:

[mysqld]
#
# * Basic Settings
#
user            = mysql
pid-file        = /var/run/mysqld/mysqld.pid
socket          = /var/run/mysqld/mysqld.sock
port            = 3306
basedir         = /usr
datadir         = /var/lib/mysql
tmpdir          = /tmp
lc-messages-dir = /usr/share/mysql
skip-external-locking
#
# Instead of skip-networking the default is now to listen only on
# localhost which is more compatible and is not less secure.
bind-address            = X.X.X.X # Local IP address (not localhost)

#
# * Fine Tuning
#
key_buffer_size         = 16M
max_allowed_packet      = 16M
thread_stack            = 192K
thread_cache_size       = 8
# This replaces the startup script and checks MyISAM tables if needed
# the first time they are touched
myisam-recover-options  = BACKUP
max_connections        = 300
#table_cache            = 64
#thread_concurrency     = 10
table_open_cache        = 900
tmp_table_size          = 514M
max_heap_table_size     = 514M
innodb_buffer_pool_size = 724M
join_buffer_size        = 16M

#
# * Query Cache Configuration
#
query_cache_limit       = 16M
query_cache_size        = 32M
query_cache_type        = 1

log_error = /var/log/mysql/error.log
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow-queries.log
long_query_time = 2
log-queries-not-using-indexes


server-id               = 1
log_bin                 = /var/log/mysql/mysql-bin.log
expire_logs_days        = 10
max_binlog_size   = 100M

Here is my db-config.php per request in comments

<?php

/**
 * LudicrousDB configuration file
 *
 * This file should be copied to ABSPATH/db-config.php and modified to suit your
 * database environment. This file comes with a basic configuration by default.
 *
 * See README.md for documentation.
 */

// Exit if accessed directly
defined( 'ABSPATH' ) || exit;

/**
 * charset (string)
 * This sets the default character set. Since WordPress 4.2, the suggested
 * setting is "utf8mb4". We strongly recommend not downgrading to utf8,
 * using latin1, or sticking to the default: utf8mb4.
 *
 * Default: utf8mb4
 */
$wpdb->charset = 'utf8mb4';

/**
 * collate (string)
 * This sets the default column collation. For best results, investigate which
 * collation is recommended for your specific character set.
 *
 * Default: utf8mb4_unicode_ci
 */
$wpdb->collate = 'utf8mb4_unicode_ci';

/**
 * save_queries (bool)
 * This is useful for debugging. Queries are saved in $wpdb->queries. It is not
 * a constant because you might want to use it momentarily.
 * Default: false
 */
$wpdb->save_queries = false;

/**
 * persistent (bool)
 * This determines whether to use mysql_connect or mysql_pconnect. The effects
 * of this setting may vary and should be carefully tested.
 * Default: false
 */
$wpdb->persistent = false;

/**
 * max_connections (int)
 * This is the number of mysql connections to keep open. Increase if you expect
 * to reuse a lot of connections to different servers. This is ignored if you
 * enable persistent connections.
 * Default: 10
 */
$wpdb->max_connections = 10;

/**
 * check_tcp_responsiveness
 * Enables checking TCP responsiveness by fsockopen prior to mysql_connect or
 * mysql_pconnect. This was added because PHP's mysql functions do not provide
 * a variable timeout setting. Disabling it may improve average performance by
 * a very tiny margin but lose protection against connections failing slowly.
 * Default: true
 */
$wpdb->check_tcp_responsiveness = true;

/**
 * This adds the same server again, only this time it is configured as a slave.
 * The last three parameters are set to the defaults but are shown for clarity.
 */
$wpdb->add_database( array(
    'host'     => 'X.X.X.X',     // If port is other than 3306, use host:port.
    'user'     => DB_USER,
    'password' => DB_PASSWORD,
    'name'     => DB_NAME,
    'write'    => 0,
    'read'     => 1,
    'dataset'  => 'global',
    'timeout'  => 0.2,
) );

$wpdb->add_database( array(
    'host'     => 'X.X.X.X',     // If port is other than 3306, use host:port.
    'user'     => DB_USER,
    'password' => DB_PASSWORD,
    'name'     => DB_NAME,
    'write'    => 1,
    'read'     => 0,
    'dataset'  => 'global',
    'timeout'  => 0.2,
) );

Best Answer

In the end, it turned out to be connection limitation. It's turned out that the new server had a miss-configuration of the network card to only serve 100M and not 1Gb of traffic as expected.

When I checked ifstat I have found that when I send all requests (read and write) into the Master the network card gets saturated to 12Kbps which is 100Mb card. And that is what creates the cascade effect above.