I have a somewhat big database server: 4 processors, 32 cores, 288GB RAM, 2 ethernet cards bounded together, 2 independent RAID controller cards with 1GB cache each, 24 2.5" disks, being 8 SAS, 15KRPM, in the data partition in RAID10, and 3 SSD, in the Journal partition in RAID5, and 2 SAS, 15KRPM in RAID 1 for the operating system. The data partition has it's own RAID controller, and the O.S. and Journal share the other card.
I'm running Ubuntu 12.04.1 LTS and MySQL 5.1.56 with InnoDB plugin 12.7 from Percona on top of that.
To the Data Partition MySQL writes indexes and data files; to the Journal partition it writes binary logs, slow and error logs, and innodb journal files.
In the last few weeks I've been observing some weird behaviour: MySQL slows down answering to queries, and at the same time doubles the "InnoDB Buffer Pool Pages Written" metric. At the same time, I see InnoDB Log buffer filling up 5 times the usual levels for the usual workload. Also, I don't observe any mutex and lock granting activity changes on the graph.
This is my current configuration:
basedir=/usr/
datadir=/var/lib/mysql/data
tmpdir=/var/lib/mysql/tmp
server-id=1
socket=/var/run/mysqld/mysqld.sock
port=3306
user=mysql
pid-file=/var/run/mysqld/mysqld.pid
language=/usr/share/mysql/english
default-storage-engine=InnoDB
read_only
max_heap_table_size=256M
tmp_table_size=256M
table_cache=1024
thread_cache_size=64
thread_stack=1024K
max_allowed_packet=16M
max_connections=255
max_user_connections=250
skip-external-locking
skip-slave-start
master-info-file=/var/lib/mysql/relay/master.info
relay-log=/var/lib/mysql/relay/relay-bin
relay-log-index=/var/lib/mysql/relay/relay-bin.index
relay-log-info-file=/var/lib/mysql/relay/relay-log.info
log-slave-updates=1
expire_logs_days=1
sync_binlog=1
max_binlog_size=1G
binlog-format=MIXED
log-bin=/var/lib/mysql/binary/mysqld-binlog
log-bin-index=/var/lib/mysql/binary/mysqld-binlog.index
log-warnings=2
log-error=/var/lib/mysql/mysqld-err.log
slow-query-log
log_slow_slave_statements=1
log_slow_timestamp_every=1
slow_query_log_microseconds_timestamp=1
log_slow_verbosity=full
long-query-time=0.05
slow-query-log-file=/var/lib/mysql/mysqld-slow.log
innodb_adaptive_flushing=1
innodb_additional_mem_pool_size=20M
innodb_buffer_pool_size=16G
innodb_data_file_path=ibdata1:20M:autoextend
innodb_data_home_dir=/var/lib/mysql/data
innodb_doublewrite_file=/var/lib/mysql/journal/ib_doublewrite
innodb_fast_shutdown=0
innodb_file_per_table
innodb_flush_log_at_trx_commit=1
innodb_flush_method=O_DIRECT
innodb_io_capacity=1500
innodb_log_group_home_dir=/var/lib/mysql/journal/
innodb_max_dirty_pages_pct=75
innodb_open_files=1024
innodb_rollback_on_timeout
innodb_thread_concurrency=20
query_cache_size=0
query_cache_type=0
key-buffer-size=200M
server-id=233111
sql-mode=NO_AUTO_CREATE_USER
max_connections=850
max_user_connections=800
read-only
table-open-cache=1300
log-error=/var/lib/mysql/log/dbserver-err.log
slow-query-log-file=/var/lib/mysql/log/dbserver-slow.log
relay-log=/var/lib/mysql/relay/dbserver-relay-bin
relay-log-index=/var/lib/mysql/relay/dbserver-relay-bin.index
relay-log-info-file=/var/lib/mysql/relay/dbserver-relay-log.info
log-bin=/var/lib/mysql/binary/dbserver-mysqld-binlog
log-bin-index=/var/lib/mysql/binary/dbserver-mysqld-binlog.index
relay_log_purge=0
innodb_buffer_pool_size=240G
innodb_log_buffer_size=2G
innodb_log_file_size=4G
large-pages
And this is my current filesystem organization:
SSD RAID5, Controller #0
/var/lib/mysql/journal -> /srv/mysql/ssd/journal
/var/lib/mysql/log -> /srv/mysql/ssd/log
/var/lib/mysql/relay -> /srv/mysql/ssd/relay
/var/lib/mysql/tmp -> /srv/mysql/ssd/tmp
SAS RAID 10, Controller #1
/var/lib/mysql/backup -> /srv/mysql/sas/backup
/var/lib/mysql/binary -> /srv/mysql/ssd/binary
/var/lib/mysql/data -> /srv/mysql/sas/data
Can you please help me understanding what is going on with my database server?
Why is it slowing down service and increasing the InnoDB Buffer Pool Write activity?
Best Answer
The symptoms I've describe are matching with dirty-page-flushing issues in my InnoDB Buffer pool. Dirty pages get flushed from the InnoDB Buffer Pool, among other situations, when InnoDB recycles one of it's Journal Files. This cause en-mass flushing of Data Pages to the disk, which causes I/O Bursts.
The links below suggest some insight on the problem.
http://www.mysqlperformanceblog.com/2008/11/13/adaptive-checkpointing/ http://dimitrik.free.fr/blog/archives/07-01-2010_07-31-2010.html http://www.chriscalender.com/?p=201
The solution passes through choosing either: