MySQL – What’s the most efficient way to batch UPDATE queries in MySQL

Tags: mysql, update

I'm writing an application that needs to flush a large number of updates to the database over an extended period of time, and I've gotten stuck on how to optimize the query. Currently I'm using INSERT INTO ... VALUES (..), (..) ON DUPLICATE KEY UPDATE, which batches all of the values into one query, but executes excruciatingly slowly on large tables. I never actually need to insert rows.

Other approaches I've seen are to update using SET value = CASE WHEN ... (which would be hard to generate given the way I'm building the queries, and I'm not sure about the performance of CASE for hundreds/thousands of keys), and simply issuing multiple concatenated UPDATE statements. Would either of these be faster than my current method?
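For reference, the CASE-based form I mean would look roughly like this (the column name `value` and the ids here are just placeholders, not my real schema):

```sql
-- One statement updating several rows at once via CASE on the primary key.
UPDATE `table`
SET value = CASE id
    WHEN 1 THEN 'foo'
    WHEN 2 THEN 'bar'
    WHEN 3 THEN 'baz'
    ELSE value
END
WHERE id IN (1, 2, 3);
```

The WHERE ... IN clause keeps the statement from touching rows that aren't part of the batch.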

It baffles me that, as far as I can tell, there's no idiomatic, efficient way to do this in MySQL. If there really isn't a way that's faster than ON DUPLICATE KEY, would it be worth it to switch to PostgreSQL and use its UPDATE FROM syntax?
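For comparison, the PostgreSQL form I have in mind is a sketch like this (again with placeholder column names), which joins the target table against a literal VALUES list:

```sql
-- PostgreSQL: update the target table from an inline VALUES list.
UPDATE "table" AS t
SET value = v.value
FROM (VALUES
    (1, 'foo'),
    (2, 'bar'),
    (3, 'baz')
) AS v (id, value)
WHERE t.id = v.id;
```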

Any other suggestions are also greatly appreciated!

Edit: here's one of the tables that gets updated frequently. I've removed the column names since they're irrelevant here.

CREATE TABLE IF NOT EXISTS `table` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `a` bigint(20) unsigned NOT NULL DEFAULT '0',
  `b` bigint(20) unsigned NOT NULL DEFAULT '0',
  `c` enum('0','1','2') NOT NULL DEFAULT '0',
  `d` char(32) NOT NULL,
  -- trimmed --
  PRIMARY KEY (`id`),
  KEY `a` (`a`),
  KEY `b` (`b`),
  KEY `c` (`c`),
  KEY `d` (`d`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8;

Best Answer

Since you're using InnoDB tables, the most obvious optimization would be to group multiple UPDATEs into a transaction.

Because InnoDB is a transactional engine, you pay not just for the UPDATE itself, but also for all the transactional overhead: managing the transaction buffer, writing the transaction log, and flushing the log to disk.

If you are logically comfortable with the idea, try grouping 100-1000 UPDATEs at a time, each batch wrapped like this:

START TRANSACTION;
UPDATE ...
UPDATE ...
UPDATE ...
UPDATE ...
COMMIT;

Possible downsides:

  • One error will roll back the entire transaction (but this is easily handled in code)
  • You might wait a long time to accumulate your 1000 UPDATEs, so you may also want a timeout after which a partial batch is committed
  • More complexity in your application code.
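On the first point, a minimal recovery pattern is a sketch like the following (using the question's schema; ids and values are placeholders):

```sql
-- Batch a few hundred UPDATEs into one transaction.
START TRANSACTION;
UPDATE `table` SET `a` = 10 WHERE `id` = 1;
UPDATE `table` SET `a` = 20 WHERE `id` = 2;
-- ... more UPDATEs in the same batch ...
COMMIT;
-- If any statement fails, issue ROLLBACK instead of COMMIT,
-- then re-run the batch (possibly with the offending row removed).
```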