Writing to a DKNF database, but reading from a de-normalized table for performance

database-design, normalization, performance

Would the following approach be reasonable?

The Goal: queries have to be relatively fast, but without the hassles incurred by writing to a de-normalized database.

The Context: A database that handles user profiles for a CMS. Profiles do change, but not very often; for the most part, each user's information stays the same for quite a while.

The Database Design:

  • First, we put together a highly normalized database, call it "N". User profiles are stored in N.

  • Being highly normalized, N is safe and elegant to update, but slow to query. So we take all the data stored in N and build a single de-normalized table from it. We call this table T (a rough sketch of T follows this list).

  • T is used for the sole purpose of retrieving user information. It is only read from, not written to, because it's prone to anomalies and would be a mess to update.

With one exception:

  • To table T, we add a boolean column, "hasChanged", default value "FALSE".
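For illustration, here is a rough sketch of what T could look like, with made-up profile columns and generic SQL types; the real table would carry whatever N stores, flattened into one row per user:

    -- De-normalized, read-only table T: one flat row per user, plus a staleness flag.
    -- Column names and types are invented for the example.
    CREATE TABLE T (
        user_id     INTEGER PRIMARY KEY,    -- same key as the user's rows in N
        user_name   VARCHAR(100),
        email       VARCHAR(255),
        city        VARCHAR(100),           -- flattened from a separate address table in N
        country     VARCHAR(100),
        hasChanged  BOOLEAN DEFAULT FALSE   -- TRUE = row may be stale, read from N instead
    );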

The CMS's algorithm for handling queries (a SQL sketch follows the numbered steps):

Suppose we want to load user 47's profile.

  1. If hasChanged == FALSE for a row with ID == 47, we can safely retrieve the user profile from this row. (Because we know nothing has changed.)

  2. If hasChanged == TRUE, then row 47 in T cannot be trusted (something has changed), so the query is passed on to the reliable but slow N.

  3. When a user updates his profile, N is updated. When N is updated, hasChanged is set to TRUE on all rows in T containing information that could be affected by this change.

  4. T is completely rewritten by the system whenever it becomes too unreliable (this is determined by how frequently queries are passed on to N).
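Here is a sketch of those four steps as SQL, reusing the hypothetical T above and a few invented base tables (users, emails, addresses) standing in for the normalized schema N; the application looks at hasChanged from the first query and decides whether to run the fallback:

    -- Steps 1-2: read the row from T; the application checks hasChanged.
    -- If FALSE, the row is served as-is; if TRUE (or the row is missing),
    -- the application runs the slow, join-heavy query against N instead.
    SELECT user_name, email, city, country, hasChanged
      FROM T
     WHERE user_id = 47;

    -- Fallback query against the normalized schema N (invented table names).
    SELECT u.user_name, e.email, a.city, a.country
      FROM users u
      JOIN emails e    ON e.user_id = u.user_id
      JOIN addresses a ON a.user_id = u.user_id
     WHERE u.user_id = 47;

    -- Step 3: every write to N also flags the affected rows in T as stale,
    -- ideally in the same transaction as the update to N.
    UPDATE T
       SET hasChanged = TRUE
     WHERE user_id = 47;

    -- Step 4: periodic rebuild, once too many reads fall through to N.
    TRUNCATE TABLE T;
    INSERT INTO T (user_id, user_name, email, city, country, hasChanged)
    SELECT u.user_id, u.user_name, e.email, a.city, a.country, FALSE
      FROM users u
      JOIN emails e    ON e.user_id = u.user_id
      JOIN addresses a ON a.user_id = u.user_id;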

Best Answer

Use the fully normalized database, identify which queries need more performance than that model can give them, and create materialized views for those queries. If the data is very volatile, keeping the views fast-refreshing introduces some overhead, but you always know the value and validity of your data, because it is protected by the database. Oracle handles this very nicely.

Your queries keep referencing the normalized tables, and Query Rewrite takes care of using the materialized views when appropriate. When the materialized views are not fresh, the queries fall back to the original tables.
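As a rough sketch of what this could look like in Oracle, with invented table names (users, addresses) standing in for the profile schema; the exact refresh options depend on the Oracle version and the workload:

    -- Materialized view logs let Oracle fast-refresh the view incrementally.
    -- A fast-refreshable join view needs ROWID logs on every base table.
    CREATE MATERIALIZED VIEW LOG ON users     WITH ROWID;
    CREATE MATERIALIZED VIEW LOG ON addresses WITH ROWID;

    -- A pre-joined profile view: refreshed on commit, eligible for query rewrite.
    -- The base-table ROWIDs in the SELECT list are needed for FAST refresh of a join view.
    CREATE MATERIALIZED VIEW user_profile_mv
      BUILD IMMEDIATE
      REFRESH FAST ON COMMIT
      ENABLE QUERY REWRITE
    AS
    SELECT u.ROWID AS u_rowid, a.ROWID AS a_rowid,
           u.user_id, u.user_name, a.city, a.country
      FROM users u, addresses a
     WHERE a.user_id = u.user_id;

    -- With ENFORCED integrity, the optimizer rewrites a query to use the view
    -- only while the view is guaranteed fresh; otherwise it reads the base tables.
    ALTER SESSION SET QUERY_REWRITE_INTEGRITY = ENFORCED;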

For documentation, see Materialized View Concepts and Architecture.