Mysql – Performance of updating row if content has changed

MySQLperformancequery-performance

I'm trying to make a website where there is a form that contains input type fields that are editable. In fact I get the elements from one of my tables (containing an id column and a string column) from my database, which I display via input with id as the id in this database and string as the string in this database. The user can edit the text in these fields and also there is the possibility to add a new 'row' (input with string, but which refers to no row in the database).

Finally, there is a button to save.

My question is do I directly modify the existing lines even if the user has not changed the string or should I check if the string has been updated? Which method is best in terms of performance?

Best Answer

If your business logic layer(s) know that there has been no change (i.e. it has already re-read the data and can compare it to the provided input then you can simply not send anything to the DB.

If you don't know before hitting the data layer if there has been any change or not then you can do something like:

UPDATE yourTable
SET    field1 = @input1
  ,    field2 = @input2
       ...   ...   ...
  ,    fieldN = @inputN
WHERE  idField = @ID
AND    (
           field1 <> @field1
        OR field2 <> @field2
           ...   ...   ...
        OR fieldN <> @fieldN
       )

This avoids any update if nothing has changed so it more efficient (the lookups needed to reject the update would be needed to perform it anyway and the extra CPU time for the comparison is vanishingly small unless you have huge data in any of the columns), but it is more work to maintain and more prone to errors so most of the time I would recommend keeping it simple and updating without checking every column, i.e. the standard:

UPDATE yourTable
SET    field1 = @input1
  ,    field2 = @input2
       ...   ...   ...
  ,    fieldN = @inputN
WHERE  idField = @ID
-- note no comparison, just update

The maintainability issue is even worse for NULLable columns as fieldN <> @fieldN has to be changed to (`fieldN <> @fieldN OR (fieldN IS NULL AND @fieldN IS NOT NULL) OR (fieldN IS NOT NULL AND @fieldN IS NULL) because comparisons between NULLs are always false (NULL is not equal to NULL, but it is also not not equal to NULL - NULL by definition is unknown so all comparisons (=, !=, <, >, ...) return false unless your DB supports non-ansi behaviour like https://stackoverflow.com/questions/9766717/). If this is all baked into you data access layer then fine, some ORMs for instance provide such optimisation as an option, but you don't want to be manually writing and maintaining it for every update statement in your application.

The DB (ever engine I'm aware of) will perform a full write of the relevant data pages even if no values have changed - they won't spend the few extra CPU cycles checking. Away from the efficiency of the update itself you need to consider knock-on effects and certain business requirements. If there are triggers that take action on updates to the table they will fire even if nothing has changed, this could be a significant consideration if you have any complex triggers to consider. From a business logic point of view do you need to log that the user actively saved the record with no changes? If you are relying on automated change capture of some sort (such as your own triggers, temporal table features in postgres and the upcoming MSSQL2016) to track changes then you (or your end users) need to decide if the data model should care that a no-change update happened or not. Of course if the record includes a "last saved" column this all becomes a moot point - there is always something to update upon save if only that one column.