Mysql – What are the causes of InnoDB Index Corruption

indexinnodbMySQLmysql-5.5percona-server

I'm running into issues where many of my database servers are experiencing repeat index corruption. The index corruption across several physical hosts, and many different tables. Whenever I restore a new server, it goes corrupt several days latter.

Normally I see index corruption on my read only slave, followed by the master less than a day later.

I am running Percona 5.5.51-38.1 on CentOS 6, on a VMWare host.

My errors mostly look like this (although I've also seen other errors):

InnoDB: End of page dump
161008 10:25:47  InnoDB: Page checksum 371733204 (32bit_calc: 1567583928), prior-to-4.0.14-form checksum 1175312553
InnoDB: stored checksum 1215686486, prior-to-4.0.14-form stored checksum 0
InnoDB: Page lsn 23 2865603967, low 4 bytes of lsn at page end 0
InnoDB: Page number (if stored to page already) 176,
InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 3784
InnoDB: Page may be an index page where index id is 10061
InnoDB: (index "key2" of table "my_database"."my_table")
InnoDB: Corruption of an index tree: table "my_database"."my_table", index "key2",
InnoDB: father ptr page no 9116, child page no 9118
PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 4; hex cc7dc1b0; asc  }  ;;
 1: len 4; hex 8003f1ec; asc     ;;
 2: len 4; hex 80000031; asc    1;;
 3: len 4; hex 80117f84; asc     ;;
 n_owned: 0; heap_no: 2; next rec: 146
PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 4; hex cc7c7030; asc  |p0;;
 1: len 4; hex 8001172e; asc    .;;
 2: len 4; hex 80000031; asc    1;;
 3: len 4; hex 80117d0a; asc   } ;;
 4: len 4; hex 0000239c; asc   # ;;
 n_owned: 6; heap_no: 483; next rec: 12200
InnoDB: You should dump + drop + reimport the table to fix the
InnoDB: corruption. If the crash happens at the database startup, see
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html about
InnoDB: forcing recovery. Then dump + drop + reimport.
161008 10:25:47  InnoDB: Assertion failure in thread 139950214960896 in file btr0btr.c line 1330
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
13:25:47 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

And here is my.cnf:

[client]
port                            = 3306
socket                          = /var/lib/mysql/mysql.sock

[mysqld]
general_log = on
user                            = mysql
read_only                       = 0    
port                            = 3306
socket                          = /var/lib/mysql/mysql.sock
datadir                         = /var/lib/mysql
symbolic-links                  = 0
skip-external-locking
key_buffer_size                 = 32M
max_allowed_packet              = 128M
table_open_cache                = 10000
sort_buffer_size                = 2M
read_buffer_size                = 2M
read_rnd_buffer_size            = 8M
myisam_sort_buffer_size         = 64M
thread_cache_size               = 8
query_cache_size                = 32M
thread_concurrency              = 8
log-bin                         = mysql-bin 
innodb_buffer_pool_size         = 8192M
innodb_data_home_dir            = /var/lib/mysql
innodb_data_file_path           = ibdata1:10M:autoextend
innodb_log_group_home_dir       = /var/lib/mysql
innodb_additional_mem_pool_size = 20M
innodb_log_file_size            = 1000M
innodb_log_buffer_size          = 8M
innodb_flush_log_at_trx_commit  = 2
innodb_lock_wait_timeout        = 50
tmpdir                          = /var/lib/mysql

pid-file                        = mysql.pid
log-error                       = mysql.err
max_binlog_size                 = 100M
log_bin_trust_function_creators = 1
expire_logs_days                = 3
max_connections                 = 2000
max_connect_errors              = 10000
lower_case_table_names          = 1
default-storage-engine          = innodb
innodb_file_format              = Barracuda
innodb_file_per_table           = 1
innodb_status_file              = 1
innodb_flush_method             = O_DIRECT
slow_query_log_file             = slow-query.log
slow_query_log                  = 0        
long_query_time                 = 10        
skip-networking                 = 0        

relay_log                       = relay-bin
server-id                       = 2         
read-only                       = 1         
skip-slave-start                = 1        

I'm mostly trying to get a list of possible causes so I can keep investigating. Initially we were running 5.5.31 with Antelope, but we upgraded to the latest 5.5 and changed to Barracuda, but that did not help.

I suspect we're hitting an edge case in MySQL, but we must be doing something to trigger it.

Best Answer

When you said InnoDB Index Corruption, I immediately thought of the InnoDB Buffer Pool

Let's start with what InnoDB Buffer Pool actually holds. Please take a look at the upper left hand corner of this Pictorial Representation of InnoDB (courtesy of Percona TCO Vadim Tkachenko)

InnoDB Plumbing

The InnoDB Buffer Pool has a section called the Change Buffer (a.k.a. Insert Buffer, which is dedicated to updating changes to non-unique indexes. Note how those changes are moved from the Buffer Pool into the System Tablespace (ibdata1). A lot of work adjusting non-unique indexes is involved. Note the MySQL Documentation Clustered and Secondary Indexes under the subheading How Secondary Indexes Relate to the Clustered Index :

All indexes other than the clustered index are known as secondary indexes. In InnoDB, each record in a secondary index contains the primary key columns for the row, as well as the columns specified for the secondary index. InnoDB uses this primary key value to search for the row in the clustered index.

If the primary key is long, the secondary indexes use more space, so it is advantageous to have a short primary key.

CONJECTURE #1

If you have large PRIMARY KEYs, I suspect the Change Buffer becoming a bit of a hog within the Buffer Pool. Changes can reach up to 50% of the Buffer Pool. You can tune that down with innodb_ibuf_max_size. The default is half the buffer pool. In your case, that would be 4096M (4G). Perhaps lowering it could throttle the amount of index maintenance needed.

CONJECTURE #2

I don't see innodb_buffer_pool_instances configured. For MySQL 5.5, the default is 1. You have innodb_buffer_pool_size set at 8192M (8G). If the Buffer Pool is more that half the installed RAM, YIKES !!! You will experience lots of swap. I recommend setting it to 2 or 4 or the number of cores assigned to the VM. I mentioned this back on Feb 12, 2011 (How do you tune MySQL for a heavy InnoDB workload?)

SUGGESTIONS

Please do one or more of the following