Database Design – Structuring Excel-like Data

database-design

The task

I need to store the some values in sequence in the database. The data looks just like a sequence of numbers. Here is an example of the data:

            Val_1   Val_2   Val_3   ...     Val_145
Record 1        6       8       4                 5
Record 2        2       5       6                 3
etc.

Now the thing is that there are a couple of actions that the user has to be able to do:

  • Insert columns (== e.g. the user needs to enter additional data in between Val_1 and Val_2 columns and inserts 5 columns there). The column headers are sequential though (val_2 always follows val_1, val_3 always follows val_2) thus only the values get shifted to the right – just like in Excel when you insert columns.
  • Delete column – same as with insert columns – column header always stay the same and values get deleted and values from the right get shifted to the left – again just like in excel.
  • View random chunks of values. E.g. view values from index 600 to index 1000

The three solutions

I came up with three solutions that can store those values, but every solution has some flaws. Here they are:

Store the index of value in the Value

Tables:

table Record
    id int
table Value
    record_id int
    index int
    value int

Pros: easy to find the needed values by using the index field (where index > 662 and index < 987).
Cons: inserting and deleting value are horrible. If there are 1000 values for a record then inserting one value at index 500 will require 1 insert and 500 updates (to shift values after index 500). Same problem with deleting values.

Store references to previous value and the next value

Tables:

table Record
    id int
table Value
    record_id int
    value int
    next_value_id int
    prev_value_id int

Pros: easy to insert and delete values – inserting a value requires 1 insert and 2 updates (updating refs on adjacent values). We can even remove the prev_value_id to make it even better.
Cons: finding values by index will be horrible. Finding range of values from 500 to 1000 will require us to go though all the values starting at value 1.

Store values as e.g. string in the record table

Tables:

table Record
    id int
    values_str text

Pros: this is just like parsing a text file. Most work is done in code. Feels like a simple and straightforward way to go.
Cons: does not feel right. This may look and feel good, but something about this design smells bad. I have a bad feeling about this.

The question

By know you probably already guessed the question – which way should I go? Is there a more sophisticated way of storing such data? Does any of the solutions I've thought of make any sense?

Best Answer

Here is one approach:

CREATE TABLE columns (
    id       INTEGER, 
    ordinal  INTEGER);

CREATE TABLE rows (
    id       INTEGER, 
    ordinal  INTEGER);

CREATE TABLE cells (
    rowid     INTEGER, 
    columnid  INTEGER, 
    value     TEXT);

This way, you will still need to +1/-1 the ordinals behind the new/deleted position, but fortunately you can do it all with a single statement, UPDATE rows SET ordinal = ordinal + 1 WHERE ordinal > 42. Although it updates many rows, the update statement should execute in less than a second.

Benefits:

  • Updating a rows table with N rows is lighter than updating a cells table with N*M rows.
  • Explicitly storing the ordinal means fast random access to an ordered subset of the data.