Mysql – Restored database copy larger than original

innodbMySQL

I've created a copy of a database for testing. It's on the same server as the original. However, tables are listed as being larger on the restored copy than on the original.

My original database is 148 MB, the copy is 232 MB. Table 'message' is 15 MB on the original, 28 MB on the copy. (I'm using queries like https://stackoverflow.com/a/1733523/492455 to get these values).

The CREATE TABLE statements (SHOW CREATE TABLE) are identical on both databases (except for the AUTO_INCREMENT value). Here's one for a table (33 MB original size, 76 MB copied size:

CREATE TABLE `mdl_log` (
  `id` bigint(10) NOT NULL AUTO_INCREMENT,
  `time` bigint(10) NOT NULL DEFAULT '0',
  `userid` bigint(10) NOT NULL DEFAULT '0',
  `ip` varchar(45) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `course` bigint(10) NOT NULL DEFAULT '0',
  `module` varchar(20) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `cmid` bigint(10) NOT NULL DEFAULT '0',
  `action` varchar(40) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `url` varchar(100) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `info` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `mdl_log_coumodact_ix` (`course`,`module`,`action`),
  KEY `mdl_log_tim_ix` (`time`),
  KEY `mdl_log_act_ix` (`action`),
  KEY `mdl_log_usecou_ix` (`userid`,`course`),
  KEY `mdl_log_cmi_ix` (`cmid`)
) ENGINE=InnoDB AUTO_INCREMENT=231557 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
COMMENT='Every action is logged as far as possible'

Can anyone suggest what's going on here?

MySQL Server v5.6.24 on CentOS 6.2.

Thanks,

Leon

Best Answer

You may find this surprising but this is something I have come to expect often. I have mentioned in other posts how data loaded in primary key order can populate BTREE indexes in a lopsided manner:

This behavior is seen for MyISAM and InnoDB alike. Don't be too alarmed by this. If the data being copied is in order by an auto-incremented primary key when dumping, the clustered index is populated in a lopsided manner from inception.

When it comes to InnoDB, bulking loading data into a table with an index will have a degree of fragmentation introduced. You just have to bite the bullet on this.

To minimize this, you should do the following for table mytable:

  • Copy the table schema for mytable
  • Create empty mytable on target server
  • Drop indexes for mytable on target server
  • Load data into mytable on target server
  • Create all indexes

MyISAM already has this paradigm in place for non-unique indexes when reloading MyISAM tables

  • ALTER TABLE mytable DISABLE KEYS;
  • Load the data
  • ALTER TABLE mytable ENABLE KEYS;

InnoDB does not :-(

CONCLUSION

Your main concern should be the content of data.

You should run pt-table-checksum for the table for both source and target servers. Even with mismatches here, do not worry too much with VARCHAR columns (See my post Native checksum table says there's a difference, pt-table-checksum does not).

You could also just dump the data in a specific into CSV files on source and target servers and compare the text just to inspect content.

In closing, even though the table is small, once the copy is done and the table is not corrupt, try what you reasonably can to defragment your tables and just live with it.