Mysql – What causes InnoDB to write 100% more pages while slowing down answering queries

database-tuninginnodbMySQLperformance

I have a somewhat big database server: 4 processors, 32 cores, 288GB RAM, 2 ethernet cards bounded together, 2 independent RAID controller cards with 1GB cache each, 24 2.5" disks, being 8 SAS, 15KRPM, in the data partition in RAID10, and 3 SSD, in the Journal partition in RAID5, and 2 SAS, 15KRPM in RAID 1 for the operating system. The data partition has it's own RAID controller, and the O.S. and Journal share the other card.

I'm running Ubuntu 12.04.1 LTS and MySQL 5.1.56 with InnoDB plugin 12.7 from Percona on top of that.

To the Data Partition MySQL writes indexes and data files; to the Journal partition it writes binary logs, slow and error logs, and innodb journal files.

In the last few weeks I've been observing some weird behaviour: MySQL slows down answering to queries, and at the same time doubles the "InnoDB Buffer Pool Pages Written" metric. At the same time, I see InnoDB Log buffer filling up 5 times the usual levels for the usual workload. Also, I don't observe any mutex and lock granting activity changes on the graph.

This is my current configuration:

  basedir=/usr/
  datadir=/var/lib/mysql/data
  tmpdir=/var/lib/mysql/tmp
  server-id=1
  socket=/var/run/mysqld/mysqld.sock
  port=3306
  user=mysql
  pid-file=/var/run/mysqld/mysqld.pid
  language=/usr/share/mysql/english
  default-storage-engine=InnoDB
  read_only
  max_heap_table_size=256M
  tmp_table_size=256M
  table_cache=1024
  thread_cache_size=64
  thread_stack=1024K
  max_allowed_packet=16M
  max_connections=255
  max_user_connections=250
  skip-external-locking
  skip-slave-start
  master-info-file=/var/lib/mysql/relay/master.info
  relay-log=/var/lib/mysql/relay/relay-bin
  relay-log-index=/var/lib/mysql/relay/relay-bin.index
  relay-log-info-file=/var/lib/mysql/relay/relay-log.info
  log-slave-updates=1
  expire_logs_days=1
  sync_binlog=1
  max_binlog_size=1G
  binlog-format=MIXED
  log-bin=/var/lib/mysql/binary/mysqld-binlog
  log-bin-index=/var/lib/mysql/binary/mysqld-binlog.index
  log-warnings=2
  log-error=/var/lib/mysql/mysqld-err.log
  slow-query-log
  log_slow_slave_statements=1
  log_slow_timestamp_every=1
  slow_query_log_microseconds_timestamp=1
  log_slow_verbosity=full
  long-query-time=0.05
  slow-query-log-file=/var/lib/mysql/mysqld-slow.log
  innodb_adaptive_flushing=1
  innodb_additional_mem_pool_size=20M
  innodb_buffer_pool_size=16G
  innodb_data_file_path=ibdata1:20M:autoextend
  innodb_data_home_dir=/var/lib/mysql/data
  innodb_doublewrite_file=/var/lib/mysql/journal/ib_doublewrite
  innodb_fast_shutdown=0
  innodb_file_per_table
  innodb_flush_log_at_trx_commit=1
  innodb_flush_method=O_DIRECT
  innodb_io_capacity=1500
  innodb_log_group_home_dir=/var/lib/mysql/journal/
  innodb_max_dirty_pages_pct=75
  innodb_open_files=1024
  innodb_rollback_on_timeout
  innodb_thread_concurrency=20
  query_cache_size=0
  query_cache_type=0
  key-buffer-size=200M
  server-id=233111
  sql-mode=NO_AUTO_CREATE_USER
  max_connections=850
  max_user_connections=800
  read-only
  table-open-cache=1300
  log-error=/var/lib/mysql/log/dbserver-err.log
  slow-query-log-file=/var/lib/mysql/log/dbserver-slow.log
  relay-log=/var/lib/mysql/relay/dbserver-relay-bin
  relay-log-index=/var/lib/mysql/relay/dbserver-relay-bin.index
  relay-log-info-file=/var/lib/mysql/relay/dbserver-relay-log.info
  log-bin=/var/lib/mysql/binary/dbserver-mysqld-binlog
  log-bin-index=/var/lib/mysql/binary/dbserver-mysqld-binlog.index
  relay_log_purge=0
  innodb_buffer_pool_size=240G
  innodb_log_buffer_size=2G
  innodb_log_file_size=4G
  large-pages

And this is my current filesystem organization:

  SSD RAID5, Controller #0
  /var/lib/mysql/journal -> /srv/mysql/ssd/journal
  /var/lib/mysql/log     -> /srv/mysql/ssd/log
  /var/lib/mysql/relay   -> /srv/mysql/ssd/relay
  /var/lib/mysql/tmp     -> /srv/mysql/ssd/tmp

  SAS RAID 10, Controller #1
  /var/lib/mysql/backup  -> /srv/mysql/sas/backup
  /var/lib/mysql/binary  -> /srv/mysql/ssd/binary
  /var/lib/mysql/data    -> /srv/mysql/sas/data

Can you please help me understanding what is going on with my database server?
Why is it slowing down service and increasing the InnoDB Buffer Pool Write activity?

Best Answer

The symptoms I've describe are matching with dirty-page-flushing issues in my InnoDB Buffer pool. Dirty pages get flushed from the InnoDB Buffer Pool, among other situations, when InnoDB recycles one of it's Journal Files. This cause en-mass flushing of Data Pages to the disk, which causes I/O Bursts.

The links below suggest some insight on the problem.

http://www.mysqlperformanceblog.com/2008/11/13/adaptive-checkpointing/ http://dimitrik.free.fr/blog/archives/07-01-2010_07-31-2010.html http://www.chriscalender.com/?p=201

The solution passes through choosing either:

  • Increase the number of the InnoDB Log Files in use, in order to increase the Maximum Checkpoint Age available;
  • Change the Checkpointing strategy (variable innodb_adaptive_checkpoint) to a more aggressive strategy;
  • Tuning innodb_io_capacity and the size of InnoDB Log File and Buffer sizes.