I am trying to understand what are the disadvantages of denormalization. Say that I have the following database (School_has_Student
is a denormalized table):
I have read that the problems that you may face when using denormalization are related to the operations of INSERT
, UPDATE
, and DELETE
.
Now I understand why we have the UPDATE
problem, because if I updated one piece of data, then I would have to update the rest of the related pieces of data (for example: if I updated the student name "Paul" in the Student
table, then I would have to also update the student name "Paul" that exist two times in the School_has_Student
table), but I don't understand why INSERT
and DELETE
are also problems.
Note: I know that the increase of storage space is also a disadvantage, but I do not think it is a big disadvantage with current storage devices which have very large capacity.
Best Answer
There are different kinds of denormalization. Say we can have the main table
readouts
like that:Table stores the readouts from the lot of sensors. If we need the last readout for each sensor we have to run the query like that:
That is quite heavy query even when the table
readouts
is properly indexed. We can denormalize the database by creating the table of the same structure but the UNIQUE constraint forsensorID
:Now we have to
INSERT
incoming readouts to the both tables. The second table should be inserted in the special way:This addtitional
INSERT
is the insert-related disadvantage of the denormalization. But the advantage is the minimal cost of the queryThe
DELETE
disadvantage is of the same sort - you need more operations ofDELETE
but you have some benefits instead.