MySQL 5.1 to 5.6 – How to Fix Massive Performance Hit

innodbmyisamMySQLmysqldumpperformance

Okay, this is possibly three questions. I want to move my existing MySQL 5.1 database using MyISAM to 5.6 using InnoDB for what I think are a host of obvious — and possibly even good — reasons.

This is on Amazon RDS, so my upgrade route is limited to dumping and recreating the database.

I will cheerfully admit I'm not a sophisticated DBA.

Problem 1: Wow is that slow!

It takes about 15 minutes to mysqldump our 160-million-odd rows. (Show table etc is coming, hold your horses.)

It took something like 50 hours to load it into a mysql 5.6 instance with engine craftily sed-script-ed to InnoDB.

Problem 2: Where are my rows?

select count(*) from node; on the current DB gives about 162 million. On the 5.6, it gives about 93 million. The load appeared to be successful, although I can't prove it; at least, there was no error message after the load terminated.

If it wasn't successful, that was really slow.

Problem 3: WOW is that slow!

So, select count(*) from node; completes in about no time at all — between 0.00 and 0.03 seconds by the query results — on 5.1. On 5.6 with InnoDB, it takes over a minute. The explain makes it obvious this is because of a difference in the way the query is optimized — but unclear why it's different.

Tables and explains

MySQL 5.1

mysql> show create table node\G
*************************** 1. row ***************************
       Table: node
Create Table: CREATE TABLE `node` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `graph` varchar(100) CHARACTER SET latin1 DEFAULT NULL,
  `subject` varchar(200) NOT NULL,
  `predicate` varchar(200) NOT NULL,
  `object` mediumtext NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `nodeindex` (`graph`(20),`subject`(100),`predicate`(100),`object`(100)),
  KEY `ix_node_subject` (`subject`),
  KEY `ix_node_graph` (`graph`),
  KEY `ix_node_object` (`object`(255)),
  KEY `ix_node_predicate` (`predicate`),
  KEY `node_po` (`predicate`,`object`(130)),
  KEY `node_so` (`subject`,`object`(130)),
  KEY `node_sp` (`subject`,`predicate`(130)),
  FULLTEXT KEY `node_search` (`object`)
) ENGINE=MyISAM AUTO_INCREMENT=550671861 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql> select count(id) from node;
+-----------+
| count(id) |
+-----------+
| 163426434 |
+-----------+
1 row in set (0.00 sec)


mysql> explain select count(id) from node;
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
1 row in set (0.00 sec)

MySQL 5.6

mysql> show create table node\G
*************************** 1. row ***************************
       Table: node
Create Table: CREATE TABLE `node` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `graph` varchar(100) CHARACTER SET latin1 DEFAULT NULL,
  `subject` varchar(200) NOT NULL,
  `predicate` varchar(200) NOT NULL,
  `object` mediumtext NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `nodeindex` (`graph`(20),`subject`(100),`predicate`(100),`object`(100)),
  KEY `ix_node_subject` (`subject`),
  KEY `ix_node_graph` (`graph`),
  KEY `ix_node_object` (`object`(255)),
  KEY `ix_node_predicate` (`predicate`),
  KEY `node_po` (`predicate`,`object`(130)),
  KEY `node_so` (`subject`,`object`(130)),
  KEY `node_sp` (`subject`,`predicate`(130)),
  FULLTEXT KEY `node_search` (`object`)
) ENGINE=InnoDB AUTO_INCREMENT=481239575 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

“`

mysql> explain select count(id) from node;
+----+-------------+-------+-------+---------------+---------------+---------+------+----------+-------------+
| id | select_type | table | type  | possible_keys | key           | key_len | ref  | rows     | Extra       |
+----+-------------+-------+-------+---------------+---------------+---------+------+----------+-------------+
|  1 | SIMPLE      | node  | index | NULL          | ix_node_graph | 103     | NULL | 79671827 | Using index |
+----+-------------+-------+-------+---------------+---------------+---------+------+----------+-------------+
1 row in set (0.00 sec)

Best Answer

Your issues have nothing to due with the version of MySQL. It has to do with the Storage Engine.

Answer to PROBLEM #1 : Wow is that slow!

Running mysqldump only touches data from the .MYD file of the MyISAM table. Thus, I do not find anything surprising about dumping 163 million rows in 15 minutes.

Loading data into Amazon RDS taking 50 hours is not shocking to me either. Why ?

Regardless of which server model you choose for MySQL RDS, InnoDB transaction logs (ib_logfile0, ib_logfile1) are always 128M and are not allowed to be changed, not even which the RDS CLI. I wrote about this before : Local database vs Amazon RDS

All writes to InnoDB are written to the Double Write Buffer> You should disable it before loading : See my post Possibilities to speed up InnoDB INSERTs and UPDATEs

Each chunk of rows from each INSERT is handled as a transaction with stuff being written through ibdata1's double write buffer and the transaction logs. Thus, the slowness.

Answer to PROBLEM #2 : Where are my rows?

Look at the nodeindex. I can see it is a prefix index.

According to the MySQL Documentation on CREATE INDEX:

Prefix support and lengths of prefixes (where supported) are storage engine dependent. For example, a prefix can be up to 1000 bytes long for MyISAM tables, and 767 bytes for InnoDB tables.

I can almost guarantee that any row where the length of graph,subject,predicate,object exceeds 767 did not make it into the InnoDB table .

Answer to PROBLEM #3 : Wow is that slow!

This is due to the Storage Engine.

When you run select count(id) from node; against a MyISAM, MyISAM cheats and reaches into the .MYD header to get the row count. Thus, the running time for getting a row count is not a function of the actual numbers of rows. That's way the MySQL Query Optimizer optimizes away all standard mechanisms and gives you a row count.

When it comes to InnoDB, because it does not store the row count, a table must be fully scanned each time : See my post Why doesn't InnoDB store the row count?

SUGGESTION

I would not import it as InnoDB. I would import is MyISAM first. Then, convert all your MyISAM tables to InnoDB. Before converting it, you may have to change the nodeindex or get rid of it altogether. Otherwise, you will lose the rows upon conversion.

See my post Which first: upgrade mysql version or convert storage engine? for more information.