Mysql – TokuDB not much faster than MySQL

innodbMySQLtokudb

I have converted a MySQL database with 80.000.000 rows to TokuDB.

Now when I run:

 select count(id) from xxx where active=1

it takes 90% of the time of the normal MySQL request.

What do I have to further optimize so that it runs faster?


The table definition:

CREATE TABLE `adsDelivered` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `uid` varchar(40) NOT NULL,
  `_adsDelivered` bigint(20) NOT NULL DEFAULT '0',
  `_campaign` bigint(20) NOT NULL DEFAULT '0',
  `_ad` bigint(20) NOT NULL DEFAULT '0',
  `session` varchar(44) NOT NULL,
  `referer` text NOT NULL,
  `refererDomain` varchar(256) NOT NULL,
  `pageTime` int(11) NOT NULL DEFAULT '0',
  `pageVisibleTime` int(11) NOT NULL DEFAULT '0',
  `browser` varchar(256) NOT NULL,
  `ip` varchar(15) NOT NULL,
  `clicks` int(11) NOT NULL DEFAULT '0',
  `clickTimeLast` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `tag` varchar(256) NOT NULL,
  `countryShort` varchar(2) NOT NULL,
  `timeCreated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `timeUpdated` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',

  PRIMARY KEY (`id`),
  UNIQUE KEY `uid` (`uid`),
  KEY `_campaign` (`_campaign`),
  KEY `_ad` (`_ad`),
  KEY `_adsDelivered` (`_adsDelivered`),
  KEY `session` (`session`),
  KEY `tag` (`tag`),
  KEY `ip` (`ip`),
  KEY `countryShort` (`countryShort`),
  KEY `refererDomain` (`refererDomain`)
) ENGINE=TokuDB AUTO_INCREMENT=7420143 DEFAULT CHARSET=utf8;

Best Answer

I work for Tokutek. The answers here are mostly good. As Justin mentions, you need the right index, and your schema probably does not have the right index. I am happy to hear TokuDB was a little faster than InnoDB, but for table scans, assuming the table is not aged, it can go either way.

Here is a talk I gave on indexing that you may find helpful: http://www.youtube.com/watch?v=vaGAoK66ctM.

The first half is on indexing, the second half is a bit of a technical dive into fractal trees on a whiteboard. Hopefully this helps you with index design. I highly recommend understanding clustering secondary indexes, which TokuDB provides.

On other points. RolandoMySQLDBA is mostly correct on how InnoDB and TokuDB perform. Here is how to think about the performance of TokuDB. While the dataset fits in memory, TokuDB's fractal trees do not have any inherent advantage over InnoDB or other B-Tree based storage engines. The bottleneck, or rather the cliff, hits when data is big and does not fit in main memory. Where InnoDB's write performance, and write performance of other B-tree based storage engines, fall off a cliff, TokuDB's performance stays level. THAT is the indicator that you are getting something out of TokuDB. TokuDB will not take some well running existing system and supercharge its performance. TokuDB will take systems work well in memory but start to break down when falling out of memory, and ensures those systems perform well as data grows. That is what is going on with benchmarks that Percona shows, that is what is going on with iiBench (http://www.tokutek.com/resources/benchmarks/#iiBench).

Combine this write performance with TokuDB's compression, and suddenly clustering indexes, as explained in the indexing talk, become relatively cheaper. Maintaining more better indexes becomes cheaper. Use better indexes, a lot of I/O from queries may dissappear, and query throughput improves. This is how one can benefit from TokuDB.