Mysql slow slave

MySQL

Background:

We have our main databases at a data centre which are all working fine. I wanted to add a slave server at our office for backup purposes. I have copied all of the data from an existing slave at the DC (/var/log/mysql,/var/lib/mysql,/etc/mysql/my.cnf) over to the office server. The office server also has the same spec as the slaves at the DC. 4 Core Processor, 4Gb of RAM and RAID 10 SSD drives.

Problem

The slave at the office is taking ages to catch up to the master and im struggling to see why. It definetly isn't comms as when I look in the /var/log/mysql folder I can see that the latest mysql relay file has upto the second data inside it although when I run mysql slave status it is still bust reading data from a relay log which has data from the day before.

When I run show processlist nothing seems to appear so it seems like the queries are getting executed quickly and also nothing appears in the slow query log.

I have used dd to test the speed of writing to the hard drive and it is taking 6 seconds to write a gig of data to the drive where the data centre is taking 5.5 so I can't see that this is a problem either.

Here is the contents of the my.cnf file. I have done the exact same things on cloud servers at rackspace and not run into this problem before.

Any help would be much appreciated.

[client]
port            = 3306
socket          = /var/run/mysqld/mysqld.sock

# Here is entries for some specific programs
# The following values assume you have at least 32M ram

# This was formally known as [safe_mysqld]. Both versions are currently parsed.
[mysqld_safe]
socket          = /var/run/mysqld/mysqld.sock
nice            = 0

[mysqld]

skip-slave-start
log-bin
expire_logs_days = 0
max_binlog_size = 100M

user            = mysql
pid-file        = /var/run/mysqld/mysqld.pid
socket          = /var/run/mysqld/mysqld.sock
port            = 3306
basedir         = /usr
datadir        = /var/lib/mysql
tmpdir          = /tmp
skip-external-locking

# Charset and Collation
character-set-server = utf8
collation-server     = utf8_general_ci
event_scheduler      = 0



#
# Instead of skip-networking the default is now to listen only on
# localhost which is more compatible and is not less secure.
#bind-address            = 0.0.0.0
#
# * Fine Tuning
#
key_buffer_size         = 256M
max_allowed_packet      = 16M
thread_stack            = 256K
thread_cache_size       = 8
sort_buffer_size        = 2M
read_buffer_size        = 128k
read_rnd_buffer_size    = 256k
join_buffer_size        = 128k

auto-increment-increment = 1
auto-increment-offset    = 1

# This replaces the startup script and checks MyISAM tables if needed
# the first time they are touched
myisam-recover          = BACKUP
max_connections         = 800
max_connect_errors      = 10
concurrent_insert       = 2
connect_timeout         = 10
wait_timeout            = 180
net_read_timeout        = 30
net_write_timeout       = 30
back_log                = 128
table_cache             = 128
table_open_cache        = 128
tmp_table_size          = 32M
max_heap_table_size     = 32M
bulk_insert_buffer_size = 32M
open-files-limit        = 1024


query_cache_limit       = 1M
query_cache_size        = 16M

log_slow_queries        = /var/log/mysql/slow.log
long_query_time         = 2

server-id = 100
binlog_format           = statement
log-slave-updates       = ON
log_bin                 = /var/log/mysql/mysql-bin.log
expire_logs_days        = 10
max_binlog_size         = 100M
binlog_cache_size       = 32K
relay_log               = /var/log/mysql/mysql-relay-bin.log
sync_binlog             = 0
slave_compressed_protocol = 0
innodb_data_home_dir            = /var/lib/mysql
innodb_log_group_home_dir       = /var/lib/mysql
innodb_file_per_table
innodb_table_locks              = true
innodb_lock_wait_timeout        = 60
innodb_thread_concurrency       = 4
innodb_commit_concurrency       = 4
innodb_support_xa               = true

innodb_buffer_pool_size = 128M
innodb_log_file_size    = 2000M
innodb_additional_mem_pool_size = 8M
innodb_data_file_path   = ibdata1:10M:autoextend
innodb_flush_log_at_trx_commit = 1
innodb_log_buffer_size  = 8M

[mysqldump]
quick
quote-names
max_allowed_packet      = 16M

[mysql]
#no-auto-rehash # faster start of mysql but no tab completition

[myisamchk]
key_buffer                = 16M

myisam_sort_buffer_size   = 8M
myisam_max_sort_file_size = 2147483648
myisam_repair_threads     = 1
myisam-recover            = BACKUP

Best Answer

Slave lag is a very common problem, the reason usually, as you have checked, is not the bandwidth or latency in the communication between servers (needed by the IO/Thread), but the other thread involved, the one that applies its own relay logs to the server: the SQL Thread.

Until recently, and with the default configuration, the only way to apply committed transactions from the master on the slave is using a single thread, which means that, if the master has an average concurrency > 1 or if there are some long-running transactions on the master (ALTER TABLE, UPDATEs of multiple rows, etc.), the slave may lag.

There are several things that you can do to avoid this problem:

  • Change binlog_format = STATEMENT on the master to binlog_format = ROW. This will minimize the latency of query applying, as it will only write the row almost always using the primary key with point writes. The main disadvantage is that if you modify lots of rows per query, the bandwidth used will be higher- also, it is a bit less flexible. On the slave, the binlog is technically not necessary, unless you want to perform point-in-time recovery on that server or you will promote it to a master in the future. In that case, you can disable the binlog to gain some write performance.
  • If you are using several schemas and MySQL 5.6, you can enable the parallel processing of transaction with the slave_parallel_workers to >1. Please note that only 5.7 will allow real parallelism -5.6 requires changes being done on separate databases in order to take advantage of this feature.
  • Reduce the write overhead of the slave- set innodb_flush_log_at_trx_commit = 2 (or 0) this will reduce the durability of the slave in case of a crash, but will improve significantly the write rate, and you already have the master providing full durability. There are more things that you can do regarding the write performance of the slave, such as disabling the binlog and the slow log, making sure you are using group commit, or setting the transaction log on its separate drive, etc.

However, there are other things that are very suspicious of causing bad performance (both on the master and the slave), such as:

innodb_thread_concurrency       = 4
innodb_commit_concurrency       = 4
innodb_support_xa               = true
innodb_buffer_pool_size         = 128M

But that requires a deeper understanding of your system, application and hardware.