Mysql – What would be the fastest approach to bulk update this thesql table (Innodb)

bulkinnodbMySQLoptimization

I have a table with 100M rows. I am able update my table like 850 updates / minute. How to speed this up?

I am using following approach:

mysql -uroot < query_file

where query_file contains 1M update queries.

The queries are like:

update table a set col1=1 where col2="something";

Please note, that the value of col2 is different for each query.

What would be the fastest approach to bulk update this mysql table (Innodb)?

Moreover, can I optimize this query like by col2 as a primary key or unique or index or anything else, you can suggest?

UPDATE:

Table has 8 columns.

Right now, the Primary key is a combination of url_hash(16) and domain name(say col 1 and col 6). Though, I am planning to add auto-increment id column as the primary key (col2 in the above example)

Table has 4 BTREE Type index:

  • Index 1 on say col 1 and col 6 <– primary key
  • Index 2 on say col 1, col 3 and col 4
  • Index 3 on say col 1, col 5
  • Index 4 on say col 4

Best Answer

The first thing you need to determine about col2 is if it can be a PRIMARY KEY.

Run this query

SELECT COUNT(1),col2 FROM table GROUP BY col2 HAVING COUNT(1) > 1;

If nothing comes back, then col2 can be a UNIQUE KEY. If even one row comes back, then col2 cannot be a UNIQUE KEY. You can create an index on it.

Since this query would take a while without an index, just go ahead and index it

ALTER TABLE table ADD INDEX col2 (col2);

but that might take a very long time.

You may need a temp table to accomplish this:

Step 01) Create a table that will hold the one million col2 values

DROP TABLE IF EXISTS col2values;
CREATE TABLE col2values ENGINE=InnoDB SELECT col2 FROM table WHERE 1=2;
ALTER TABLE col2value ADD PRIMARY KEY (col2);

Step 02) You can load the col2values table with the one million col2 values

Step 03) Index col2

CREATE TABLE table2 LIKE table;
ALTER TABLE table2 ADD INDEX col2 (col2);
INSERT INTO table2 SELECT * FROM table;
DROP TABLE table;
ALTER TABLE table2 RENAME table;

Step 04) Perform update using INNER JOIN

UPDATE col2values A
INNER JOIN table B
USING (col2)
SET B.col1=1;

In the above steps

  • Step 03 needs to be done only once
  • Step 03 is done with 5 lines rather than ALTER TABLE table ADD INDEX col2 (col2); because it would require a full rollback operation should anything go wrong.
  • All other steps are your bulk insert steps going forward

Give it a Try !!!