Database Design – How to Normalize a Very Small Database

normalization

I am designing a very small database (c. 250 rows). I have a number of data points for each record which, under normalization rules, ought to be in separate tables.

For instance:

  1. Percentages of a number of different categories (which should add up to 100)
  2. Multiple sources of data (a record may come from up to 4 data sources, and the reference within each of these sources should be recorded)
  3. Columns with very different units: for instance, I want to store residential units and gross floor area (GFA) delivered.

All these fields are optional, none are mandatory.
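
For reference, normalizing case 2 would mean a separate source-reference table along these lines. This is only a sketch; the table and column names are invented, and it assumes SQLite via Python's `sqlite3`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE record (id INTEGER PRIMARY KEY, name TEXT);

    -- One row per (record, source) pair, instead of four nullable
    -- source/reference column pairs on the main table.
    CREATE TABLE record_source (
        record_id INTEGER NOT NULL REFERENCES record(id),
        source    TEXT    NOT NULL,   -- which data source (hypothetical values)
        reference TEXT    NOT NULL,   -- the record's reference within that source
        PRIMARY KEY (record_id, source)
    );

    INSERT INTO record VALUES (1, 'Project A');
    INSERT INTO record_source VALUES (1, 'Survey 2021', 'S-0042');
    INSERT INTO record_source VALUES (1, 'Planning register', 'PR/1187');
""")
for row in conn.execute(
        "SELECT source, reference FROM record_source WHERE record_id = 1"):
    print(row)
```

The composite primary key also gives you a free integrity check: the same source cannot be recorded twice for one record.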

I am considering not normalizing in case 1, because I want a data integrity check that all the relevant columns in each database row add up to 100%.
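
That kind of row-level check can be expressed as a table `CHECK` constraint, which works whether or not you normalize elsewhere. A minimal sketch, assuming SQLite and invented category columns (all optional, per the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE project (
        id              INTEGER PRIMARY KEY,
        pct_retail      REAL,
        pct_office      REAL,
        pct_residential REAL,
        -- Either all percentages are NULL (the fields are optional),
        -- or the non-NULL ones must sum to exactly 100.
        CHECK (
            (pct_retail IS NULL AND pct_office IS NULL
             AND pct_residential IS NULL)
            OR (COALESCE(pct_retail, 0) + COALESCE(pct_office, 0)
                + COALESCE(pct_residential, 0) = 100)
        )
    )
""")
conn.execute("INSERT INTO project VALUES (1, 30, 20, 50)")      # passes
try:
    conn.execute("INSERT INTO project VALUES (2, 60, 20, 50)")  # sums to 130
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

One caveat: with `REAL` columns an exact `= 100` comparison can fail for fractional percentages due to floating-point rounding, so a tolerance (or integer percentages) may be safer.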

For the other two cases, normalizing seems like a lot of work for not much gain. For case 3, I would have to design two additional tables, one to store the data:

| ID (prim key) | project (FK) | type (FK) | value (float) |
|---------------|--------------|-----------|---------------|
| 1             | 1            | 1         | 200.3         |

and another to store the column metadata:

| typeID (prim key) | Description |
|-------------------|-------------|
| 1                 | GFA (m2)    |

rather than just do this in the main table:

| RecordID(prim key) | GFA(m2) | Residential units | .... |
|--------------------|---------|-------------------|------|
| 1                  | 200     | 25                | .... |
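
The normalized version (the first two tables) can be sketched as follows, with made-up names, to show that the wide row is still recoverable with a join:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE measure_type (
        type_id     INTEGER PRIMARY KEY,
        description TEXT NOT NULL          -- e.g. 'GFA (m2)'
    );
    CREATE TABLE measure (
        id      INTEGER PRIMARY KEY,
        project INTEGER NOT NULL,
        type_id INTEGER NOT NULL REFERENCES measure_type(type_id),
        value   REAL NOT NULL
    );

    INSERT INTO measure_type VALUES (1, 'GFA (m2)'), (2, 'Residential units');
    INSERT INTO measure VALUES (1, 1, 1, 200.3), (2, 1, 2, 25);
""")
# Reassemble the wide row for project 1 with a join.
for desc, value in conn.execute("""
        SELECT mt.description, m.value
        FROM measure m JOIN measure_type mt USING (type_id)
        WHERE m.project = 1"""):
    print(desc, value)
```

A side effect of this shape is that optional fields cost nothing: a missing measure is simply an absent row rather than a NULL column.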

With such small data, my main concern is updating things if the schema changes, but I can't think of a reason that would be harder with the denormalized data.

Can anyone give me a good reason to normalize for cases 2 and 3?

Thanks

Best Answer

For a small database with just a few hundred rows, you could argue that the right design is whatever you can maintain and sustain. However, a few things to think about:

  1. Remember that the first implementation is not the future implementation. Planning ahead just a little can save headaches in the future.
  2. If the database scales up in the future, you might find yourself refactoring tables and code.
  3. Since the columns of the denormalized main table are named GFA(m2), Residential units, et cetera, one per thing you measure, you are committing to altering that table repeatedly as new measures are added.
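
To make point 3 concrete: in the normalized design a new measure is a data change, while in the wide table it is a schema change. A sketch with invented names, assuming SQLite:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE measure_type (type_id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE wide (record_id INTEGER PRIMARY KEY,
                       gfa_m2 REAL,
                       residential_units REAL);
""")

# Normalized: a new measure is just another row; the schema is untouched
# and existing queries and code keep working.
conn.execute("INSERT INTO measure_type VALUES (3, 'Parking spaces')")

# Denormalized: every new measure alters the table itself, and any code
# that reads the table's columns may need to change with it.
conn.execute("ALTER TABLE wide ADD COLUMN parking_spaces REAL")
```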

By the way, I note that your value (float) of 200.3 becomes 200 in the final table. (Maybe just a typo.) Either way, be sure to define column data types that fit the data you are storing.

And, yes, if I were delivering this to someone, I would normalize it further.