Professor told us to store serialized Java objects as blobs instead of defining relational tables

database-design

Instead of actually defining tables with the correct attributes, my professor told us we could map objects to ids like this:

id (int)  |   Serialized Object (blob)
   1               10010110110

I can see so many problems with this: data redundancy, having to track ids separately, having to pull the whole table into memory to search for anything, and **if I want to change my model in the Java code, I will no longer be able to deserialize the blob stored in the database into that model.**

**Either I am forever stuck with that model or I have to do some other really ugly stuff to change my model.** This whole thing seems like bad form to me. Am I justified in disagreeing with my professor? Is there some benefit to doing this that I have not thought of? If I'm correct, should I say something to my professor about this? He was preaching this to my whole class and even said that he has built projects that way. A second opinion would be great.

The course is named Software Design.

My professor did not say that this was the best way, but he did say that it was a legitimate alternative to defining relational tables.

The model is not dynamic in any way.

Best Answer

It is not, in itself, a bad thing - at all. Arguing about "which is better" without a proper context (=exact requirements) is an exercise in futility.
The part in bold is wrong. Java serialization lets you add new fields to a class whose instances are already serialized and still read the older objects, provided the class keeps the same serialVersionUID. You can also simply create new classes instead of changing the original ones.
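
As a minimal sketch of that point (the class and field names here are hypothetical, not taken from the question), a model class can pin its `serialVersionUID` and later gain a field. Under the Java serialization rules, adding a field is a compatible change, so a blob written before the field existed should still deserialize, with the new field left at its default value. The round trip below only shows the mechanics of writing and reading a blob; it does not itself replay an old blob.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class EvolutionSketch {

    // Hypothetical model class, used only for illustration.
    static class PlayerProfile implements Serializable {
        // Pinned explicitly, so adding fields later does not change the stream version.
        private static final long serialVersionUID = 1L;

        final String name;
        final int score;
        // Assumption: this field was added in a later release. Blobs written
        // before it existed would deserialize with level == 0 (the default).
        final int level;

        PlayerProfile(String name, int score, int level) {
            this.name = name;
            this.score = score;
            this.level = level;
        }
    }

    // Serialize any Serializable object into a byte array suitable for a blob column.
    static byte[] toBlob(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Read an object back from a byte array produced by toBlob.
    static Object fromBlob(byte[] blob) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] blob = toBlob(new PlayerProfile("ada", 42, 3));
        PlayerProfile restored = (PlayerProfile) fromBlob(blob);
        System.out.println(restored.name + " " + restored.score + " " + restored.level);
    }
}
```

Removing a field or changing a field's type is a different matter; those changes are not covered by this sketch.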

Your discussion with the professor should focus on the pros and cons of "relational" versus "key-value store" in different scenarios, not on abstract "betterness". Otherwise you might as well discuss whether Christmas is superior to Thanksgiving.

-- an edit, after reading other answers.

One of the other answers goes as far as to state that "it's hard to imagine a case where pros outweigh the cons".

Because the whole discussion must be about concrete problems (otherwise we can't even define "better" and "worse"), let me give you one concrete example. It's completely made up, but I tried to flesh out as many details as possible.

Imagine you have an online gaming site, with a database that stores statistics of players in different online games (played in-browser, written in GWT and cross-compiled to JavaScript). Some of the games are strategic, some are action games, some are platformers. The database is relational and stores the players, the history of plays, and the scores.

One day you get an additional requirement: let the players save the game state to the cloud, during the game, so they can resume the game later at the same point. Needless to say, the only reason to store this temporary state is to return to the game; the state itself will never be introspected.

Now you have two basic choices:

1. Since the games are written in Java, you can quite easily take the model, send it to the server, serialize it in one line of code and store it as a blob. The table will be called "saved_games" and it will have foreign keys to the player and so on. From the point of view of the database, a "save game" is an opaque, indivisible blob.
2. You can create a separate relational model for each of your 100 games (this will be tens of tables per game). For pacman alone, for example, you will need tables storing the positions of all the uneaten pellets, the bonuses, and the positions and current state of the ghosts. If someone, someday, modifies the game, even slightly, you will have to update the relational model. Also, for each type of game, you will have to implement logic to write the Java model to the database and to read it back.
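
The first option could look roughly like the sketch below. It is only an illustration under assumptions: the `PacmanState` class, the `player_id` and `state` column names, and the `save` helper are all hypothetical; only the "saved_games" table name comes from the scenario above. The `save` method uses standard JDBC calls but is not executed here, since it would need a live database connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class SavedGameSketch {

    // Hypothetical game model; a real game would carry far more state.
    static class PacmanState implements Serializable {
        private static final long serialVersionUID = 1L;

        final int level;
        final int score;
        final ArrayList<String> uneatenPellets; // e.g. "x,y" coordinates

        PacmanState(int level, int score, List<String> uneatenPellets) {
            this.level = level;
            this.score = score;
            this.uneatenPellets = new ArrayList<>(uneatenPellets);
        }
    }

    // Turn the whole object graph into one opaque byte array.
    static byte[] serialize(Serializable state) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(state);
        }
        return buffer.toByteArray();
    }

    // Rebuild the object graph from the stored bytes.
    static Object deserialize(byte[] blob) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
            return in.readObject();
        }
    }

    // Store the blob next to a foreign key to the player.
    // Column names are assumptions; the database never looks inside the blob.
    static void save(Connection conn, long playerId, Serializable state)
            throws SQLException, IOException {
        String sql = "INSERT INTO saved_games (player_id, state) VALUES (?, ?)";
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setLong(1, playerId);
            stmt.setBytes(2, serialize(state));
            stmt.executeUpdate();
        }
    }

    public static void main(String[] args) throws Exception {
        PacmanState state = new PacmanState(3, 1200, List.of("1,1", "4,7"));
        byte[] blob = serialize(state);
        PacmanState restored = (PacmanState) deserialize(blob);
        System.out.println("restored level " + restored.level
                + ", pellets left: " + restored.uneatenPellets.size());
    }
}
```

The same `serialize`, `deserialize` and `save` code would work unchanged for every game's model class, which is the point of the first option: nothing in the database schema depends on what a particular game stores.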

The answer by Justin Cave says that you should go with the second option. I think this would be a huge mistake.

Also, I have a hunch that Justin Cave perceives what I presented above as an "edge" or "rare" case. Unless he can present some sort of hard data (based on a representative sampling of all the IT projects in the world, not just, say, enterprise applications in the US), I will consider that opinion a classic case of projection bias.

Actually, the problem of serialized Java objects in a relational database is far deeper than it seems. It touches the very core of 1NF, namely: what is the domain of an attribute? If you are really interested in the topic, there's a great article by C. J. Date in his Date on Database: Writings 2000-2006.
