MySQL INNODB corruption after server crash during concurrent truncate command

corruptioncrashindexinnodbMySQL

My server crashed today I think due to a concurrent truncate table command on one of our INNODB tables. The server could be restarted, but after it starts up, everytime I try to issue an SQL command, I get the following error:

ERROR 2006 (HY000): MySQL server has gone away

This is what happened in the logs:

121206 01:11:12  mysqld restarted
121206  1:11:13  InnoDB: Started; log sequence number 275 559321759
InnoDB: !!! innodb_force_recovery is set to 1 !!!
121206  1:11:13 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.0.95-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
InnoDB: Error: trying to load index PRIMARY for table 
InnoDB: but the index tree has been freed!
121206  1:11:37 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=134217728
read_buffer_size=1048576
max_used_connections=1
max_connections=400
threads_connected=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 950272 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x9900950
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x46353fa0, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
(nil)
New value of fp=0x9900950 failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/using-stack-trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x993e500 =
thd->thread_id=1
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

I have searched online and I get the hint it is a MySQL bug, but I have no idea how to solve it. I am using MySQL version 5.0.95.

It seems like I have to create a new database and dump the old data into the new one, but how can I do that if I can't even issue any SQL commands to the current one?

— UPDATE —
Version: '5.0.95-log' socket: '/var/lib/mysql/mysql.sock' port: 3306 Source distribution
InnoDB: Error: trying to load index PRIMARY for table
InnoDB: but the index tree has been freed!
121206 4:13:41 – mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=134217728
read_buffer_size=1048576
max_used_connections=1
max_connections=400
threads_connected=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 950272 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x17fb8950
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x464a3fa0, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
(nil)
New value of fp=0x17fb8950 failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/using-stack-trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x17ff6500 =
thd->thread_id=3
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

Number of processes running now: 0
121206 04:13:41  mysqld restarted
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
121206  4:13:42  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
121206  4:13:43  InnoDB: Started; log sequence number 275 559323148
121206  4:13:43 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.0.95-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution

Best Answer

ASPECT #1

The first thing that caught my eye was this line

InnoDB: Error: trying to load index PRIMARY for table /

This indicates you have a table using the InnoDB Storage Engine

What is interesting about InnoDB is the way a PRIMARY KEY is stored. It is stored in a structure called the gen_clust_index, or more commonly known as the Clustered Index.

My immediate guess is that a certain PRIMARY KEY entry is too big

Please consider some articles on the good, the bad, and the ugly of using long PRIMARY KEYs:

then see if the <DB Hidden>.<Table Hidden> needs to be redesigned.

ASPECT #2

In terms of your conjecture concerning a parallel truncate table, that sounds kind of dangerous. Why? InnoDB performs TRUNCATE TABLE as DDL not DML. I have written about this before:

ASPECT #3

Some tuning suggestions

Please add the following to my.ini

[mysqld]
max_allowed_packet=1G
innodb_fast_shutdown=0

Start mysql

In another session, run tail -f <errorlogfile> and watch InnoDB Crash Recovery.

If mysql is fully started back up and InnoDB crash recovery has completed, try to shut mysql down immediately. You may need to resize your InnoDB Transaction Logs.

Sorry for these wild suggestions, but I am flying blind here.

Please post the following in the question:

  • your entire my.cnf
  • how much RAM is on board

UPDATE 2012-12-05 12:09 EDT

Please do the following:

STEP 01) Add these changes to my.cnf

[mysqld]
max_allowed_packet=1G
innodb_fast_shutdown=0
innodb_thread_concurrency=0

STEP 02) service mysql restart

to make sure mysql comes up

STEP 03) You need to resize ib_logfile0 and ib_logfile1 (24M might be too small)

service mysql stop
cd /var/lib/mysql
mv ib_logfile0 ib_logfile0.bak
mv ib_logfile1 ib_logfile1.bak

STEP 04) Add these changes to my.cnf

[mysqld]
innodb_log_file_size=512M
innodb_log_buffer_size=8M

STEP 05) service mysql start

mysqld will recreate ib_logfile0 and ib_logfile1 512M each

Now, try and see what happens....

UPDATE 2012-12-05 12:18 EDT

In the meantime, please read my ServerFault post on the mysql packet and its sizing implication with regard to the innodb_log_file_size and innodb_log_buffer_size as I learned from someone else's ServerFault post.

UPDATE 2012-12-05 14:28 EDT

I edited all references to customer tables out of this question.

The root cause was a damaged page in ibdata1 with data and index pages mixed inside. I helped Andrew migrate data out, recreate ibdata1 with innodb_file_per_table, and Andrew reloaded the data.