Mysql – Huge MySQL slave replication lag (days) at almost idle servers. Slave is mostly ‘Waiting for dependent transaction to commit’

MySQL

I have a strange situation with MySQL replication lag happening on relatively idle servers.

Slave shows no symptoms of IO saturation (according to top, iotop or iostat), no symptoms of CPU saturation. If I do SHOW SLAVE STATUS at regular intervals, it seems, that slave spends 99% of it's time at two events:

  • Waiting for slave workers to process their queues
  • Waiting for dependent transaction to commit (happening most often).

The problem is that the database is about 1Tb in size and it's not easy experimenting with it. Both master and slave are Xeon E5 with SSD RAID10 storage. Investigation showed no lagging of slave IO thread. Everything looks like slave SQL thread is constantly waiting for some event to complete before replaying next transaction from log.

Replication type used is row-based. We tried to switch to slave_parallel_type = LOGICAL_CLOCK and set slave_parallel_workers = 24 but with no success. We see no server resource over-utilization. All metrics show that the server is mostly idle. log_slow_slave_statements shows nothing. innodb_flush_log_at_trx_commit = 2 to prevent fsyncs() on small transactions (not that it was necessary because we are nowhere near iops limit but we wanted to try something).

Putting additional load to server does not affect replication lag.

Database being replicated contains about 95% of InnoDB tables and some TokuDB tables.

This situation started about several months ago at a time point none can remember. Despite of the fact, that it is a visible point in Grafana for us, we can't remember or find something in logs that could bring us closer to the problem cause.

MySQL server version is 5.7 by Percona.

Best Answer

Check these two server parameters (my.cnf):

  1. slave_pending_jobs_size_max should be same or more than the value of max_allowed_packet https://dev.mysql.com/doc/refman/8.0/en/replication-options-slave.html#sysvar_slave_pending_jobs_size_max

  2. slave_transaction_retries value should be twice of slave_parallel_workers.