Bulk Update vs Row Wise Update in MySQL – Performance Comparison

MySQLperformancequery-performancereplicationupdate

I have to update multiple rows (> 100k) satisfying a particular criterion in production environment. Since this query takes much time, it comes in slow query. someone told me that instead of making a bulk update, we should send update query row wise.

Example-
old query:

update table_1 
set column_20 = 'abc' 
where updatedOn > date_sub(now(), interval 1day) and column_19 = 'xyz' 

New query/method:

select auto_increment_id_column  
from table_1 
where updatedOn > date_sub(now(), interval 1day) and column_19 = 'xyz' 

Now use a "for loop" for all id's extracted from above, then send below query for each id:

update table_1 
set column_20 = 'abc' 
where auto_increment_id_column = id_from_for_loop;

This method seems to be counter intutive, so, please confirm, if it should be done. Also how it will affect master slave replication.

dbinfo – mysql and using statement based replication

Major Update: Our DBA recently updated the replication to row based

Best Answer

The Update would benefit from the composite INDEX(column_19, updatedOn).

It is almost never faster to use a client loop to do something that can be done in a single SQL statement.

Since you are talking about updating upwards of a million rows, there are other things to discuss...

One should rarely need to update more than a few rows of a table in a batch. Perhaps column_20 does not belong in that table; maybe it belongs in a small, maybe one-row, table to avoid this? Maybe something else. If you care to describe the semantics of the columns, maybe I can give you more concrete advice.

The reason for the big updating seeming to take "too long" is that it is saving things in case you need to ROLLBACK.

But, if you do need to do a massive UPDATE or DELETE, I recommend doing it in chunks. There are notes on the impact on Replication in that document.