MySQL: Overlay modifications on original data while indexing on a field that itself has modifications

data integrity, index, MySQL, view

I have a MySQL table of original_data that is important for historical/auditing purposes and should not ever be changed. I want to be able to mark up these data with modifications, e.g. to correct errors in the original data entry, and view the overall composite dataset (changes overlaid on original data), without making any modifications to the table of original_data.

My fields are a mix of int, varchar, and datetime; nothing bizarre.

The system currently accomplishes this with the following:

  • A table original_data (which is never modified). This is indexed
    on a datetime field.

  • A table modifications, which includes all fields that
    original_data has. Each row in modifications references the primary
    key of the row in original_data that is modified.

  • A view mods_overlay that joins original_data to modifications on
    that key, displaying modified data (where it exists) in place of
    original data. I'm using CASE WHEN expressions to return a field from
    modifications where one exists for a particular row, and otherwise
    return the field from original_data.
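
For concreteness, the setup above could be sketched roughly as follows. This is only an illustration: it uses SQLite through Python's sqlite3 module so that it runs as-is, whereas the real system is MySQL. The column names (event_time, reading, note), the index name, and the convention that a NULL in modifications means "field not changed" are all assumptions, not details from the actual schema.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE original_data (
    id         INTEGER PRIMARY KEY,
    event_time TEXT NOT NULL,   -- hypothetical name for the indexed datetime field
    reading    INTEGER,
    note       TEXT
);
CREATE INDEX idx_original_event_time ON original_data (event_time);

-- At most one correction row per original row.
-- Assumed convention: NULL in a column means "this field is not modified".
CREATE TABLE modifications (
    original_id INTEGER PRIMARY KEY REFERENCES original_data (id),
    event_time  TEXT,
    reading     INTEGER,
    note        TEXT
);

-- The overlay: prefer the modified value when there is one.
CREATE VIEW mods_overlay AS
SELECT
    o.id,
    CASE WHEN m.event_time IS NOT NULL THEN m.event_time ELSE o.event_time END AS event_time,
    CASE WHEN m.reading    IS NOT NULL THEN m.reading    ELSE o.reading    END AS reading,
    CASE WHEN m.note       IS NOT NULL THEN m.note       ELSE o.note       END AS note
FROM original_data AS o
LEFT JOIN modifications AS m ON m.original_id = o.id;
""")

conn.executemany(
    "INSERT INTO original_data VALUES (?, ?, ?, ?)",
    [
        (1, "2020-01-01 08:00:00", 10, "ok"),
        (2, "2020-01-02 09:30:00", 20, "wrong date entered"),
    ],
)
# Correct only the datetime of row 2; the other fields stay as entered.
conn.execute(
    "INSERT INTO modifications VALUES (?, ?, ?, ?)",
    (2, "2020-01-03 09:30:00", None, None),
)
conn.commit()

overlay_rows = conn.execute(
    "SELECT id, event_time, reading, note FROM mods_overlay ORDER BY id"
).fetchall()
original_rows = conn.execute(
    "SELECT id, event_time, reading, note FROM original_data ORDER BY id"
).fetchall()

for row in overlay_rows:
    print(row)
```

The view shows the corrected datetime for row 2 while original_data itself is left untouched.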

So far, so good! The problem is that I have millions of rows, and one of the fields I need to modify is the datetime field on which original_data is indexed. As soon as my mods_overlay view overlays this datetime field, I can no longer efficiently select data for a specific period of time: the runtime of SELECT statements increases from a few seconds to 30+ minutes. MySQL supports neither indexes on views nor materialized views, either of which could help here.
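
To illustrate why the time-range filter becomes slow, here is a small sketch that compares query plans for the base table and for the overlay view. It again uses SQLite via Python's sqlite3 as a stand-in, so the planner and its output are not MySQL's. The underlying reason should carry over, though: the filter on the view applies to a computed CASE expression rather than to the indexed column. Table, column, and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE original_data (
    id         INTEGER PRIMARY KEY,
    event_time TEXT NOT NULL,   -- hypothetical indexed datetime field
    reading    INTEGER
);
CREATE INDEX idx_original_event_time ON original_data (event_time);

CREATE TABLE modifications (
    original_id INTEGER PRIMARY KEY,
    event_time  TEXT,           -- NULL is assumed to mean "not modified"
    reading     INTEGER
);

CREATE VIEW mods_overlay AS
SELECT
    o.id,
    CASE WHEN m.event_time IS NOT NULL THEN m.event_time ELSE o.event_time END AS event_time,
    CASE WHEN m.reading    IS NOT NULL THEN m.reading    ELSE o.reading    END AS reading
FROM original_data AS o
LEFT JOIN modifications AS m ON m.original_id = o.id;
""")


def plan(sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "\n".join(row[3] for row in rows)


time_filter = (
    "WHERE event_time >= '2020-01-01 00:00:00' "
    "AND event_time < '2020-02-01 00:00:00'"
)

# Filter on the real, indexed column.
base_plan = plan("SELECT id, event_time, reading FROM original_data " + time_filter)

# Same filter, but event_time is now the CASE expression defined in the view.
overlay_plan = plan("SELECT id, event_time, reading FROM mods_overlay " + time_filter)

print("base table:\n" + base_plan)
print("overlay view:\n" + overlay_plan)
```

In this sketch the base-table query can search idx_original_event_time, while the view query has to walk original_data and evaluate the CASE for every row before it can apply the time filter.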

One workaround: I could implement mods_overlay as a table that is pre-computed overnight, but then new modifications cannot be seen until the next day, and I'm not sure if this will be acceptable.
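
A minimal sketch of that pre-computed workaround, again in Python with sqlite3 as a stand-in for MySQL. The table name mods_overlay_table, the column names, the index name, and the NULL-means-unmodified convention are assumptions. COALESCE is used as a shorthand for the CASE WHEN ... IS NOT NULL overlay. The second half shows the drawback mentioned above: a modification made after the rebuild is invisible until the next rebuild.

```python
import sqlite3


def rebuild_overlay_table(conn):
    """Recompute the overlay from scratch and index the overlaid datetime."""
    conn.execute("DROP TABLE IF EXISTS mods_overlay_table")
    conn.execute("""
        CREATE TABLE mods_overlay_table AS
        SELECT
            o.id,
            COALESCE(m.event_time, o.event_time) AS event_time,
            COALESCE(m.reading, o.reading)       AS reading
        FROM original_data AS o
        LEFT JOIN modifications AS m ON m.original_id = o.id
    """)
    # Unlike a view, a real table can carry an index on the overlaid field.
    conn.execute(
        "CREATE INDEX idx_overlay_event_time ON mods_overlay_table (event_time)"
    )
    conn.commit()


def rows_on_jan_3(conn):
    """Select overlay rows whose (possibly corrected) datetime is 2020-01-03."""
    return conn.execute("""
        SELECT id, event_time, reading FROM mods_overlay_table
        WHERE event_time >= '2020-01-03 00:00:00'
          AND event_time <  '2020-01-04 00:00:00'
    """).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE original_data (
    id INTEGER PRIMARY KEY, event_time TEXT NOT NULL, reading INTEGER
);
CREATE TABLE modifications (
    original_id INTEGER PRIMARY KEY, event_time TEXT, reading INTEGER
);
""")
conn.execute("INSERT INTO original_data VALUES (1, '2020-01-02 09:30:00', 20)")
conn.commit()

rebuild_overlay_table(conn)            # the "overnight" job

# A correction arrives during the day: the date should have been Jan 3.
conn.execute(
    "INSERT INTO modifications VALUES (1, '2020-01-03 09:30:00', NULL)"
)
conn.commit()

stale = rows_on_jan_3(conn)            # correction not visible yet
rebuild_overlay_table(conn)            # the next "overnight" job
fresh = rows_on_jan_3(conn)            # correction now visible

print("before next rebuild:", stale)
print("after next rebuild: ", fresh)
```

In MySQL the same idea would likely build the new table under a temporary name and swap it in with RENAME TABLE, so that readers never see a missing or half-built table. That detail is left out of the sketch.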

Are there other ways to solve this problem without making significant tooling changes?

Stated differently: is there a best practice in MySQL for providing a way to overlay a set of modifications on a set of original data (in a table that is never modified), while indexing on a field that can itself have modifications?

Best Answer

Very few users use VIEWs.

The typical approach is to ALTER the table to make the desired schema changes.

pt-online-schema-change (part of Percona Toolkit) is the tool of choice if ALGORITHM=INPLACE is not available for the desired change.