In SQL databases does an update rewrite all columns in the row or only those in the SET tuple

performancerdbmsupdate

i was wondering how a sql update works in two scenarios:

table with 3 columns: a, b, c

say the data value for a has to be changed.

1: update on a row setting all columns a, b, c (where b and c have not changed from what is stored)

2: update on a row setting only the a column value (because b and c have not changed)

is there any performance difference between these? im having a hard time finding information about how they actually work behind the scenes. from the few articles i found my understanding is that in an update the db will:

  • load the row (all columns) into memory
  • set the values from the set tuple (in order)
  • write the updated row back

Best Answer

This is going to vary between databases, but there are things that we can say are generally (but certainly not always) true of common databases such as SQL Server, Postgres, etc.

Drive IO generally works in blocks, usually these days of 4Kbyte or 0.5K, and to write any of a block the drive will write the whole block. Database engines organise their structures into pages for various reasons including to mach the IO subsystem. In MS SQL Server each page is 8K, done databases allow you to choose on a per DB or per server basis. As with the underlying drives writing to a page will result in rewriting the whole page (which may be multiple disk blocks in size) not just the part constituting the updated row(s) or the individual columns in the row(s). So the same amount of IO will be done in your two examples (SET a & SET a, b, c).

In fact more than just that page will be affected: an entry will be made in the transaction log first, and relevant index pages may need to be updated too. The MVCC model used (or supported as an option) by some database systems as extra details to the question, but that detail may be getting a bit deeper than the intended scope of your question.

Another point that does seem relevant is that this IO will happen even if nothing really changes: if you are seeing column a to 42 and it is already 42 the writes will still happen. Further more, any relevant triggers will still fire, constraints will be checked, etc. This is why you sometimes see statements like UPDATE someTable SET aColumn = @aValue WHERE aColumn <> @aValue, to avoid excess activity (if doing this in your code, be sure to be careful to account for NULLs if they might be present, which my example does not for the sake of being concise, also be aware that there are circumstances where you might want triggers to fire even if an update otherwise effectively does nothing).

It is worth reiterating before finishing this answer, that none of this is true for all databases, or all IO more generally, but is only commonly true.