Your design has a big disadvantage: Getting just the current values (which you will work with most of the time) is an expensive operation that requires a join.
Your system would be much faster if you have a table with the current data and an extra table for historical information, consisting of all columns of users
and a valid_until
timestamp column, with a primary key over (user_id
,valid_until
).
Option 1 is a no; this level of un-normalized data will mean many extra data pages occupying more space on disk, more data pages having to be read/written on operations, ...
Between Option 2 and Option 3 I'd go with something closer to Option 3 - In here you will save many bytes related to users (an Int
goes for 4 bytes instead of the 30/50 of the Chars
). This will save you again considerable disk space (considering users appear more than once) and get you smaller indexes which will mean better performance.
Now, on Option 3, I might be missing something for I don't understand what entityName and entityId stand for. If this is user information, it will better stay on the virtual_user
table so you won't get this info repeated in every meta. Also, on the meta
table I'd keep the orderId
, it will allow you proper and fast joining of the meta
and order
tables, providing adequate indexing is present. I don't know the specifics of your requirements but consider if the updated
column is necessary on the Order
table or if for displaying the desired info on the application you'll be joining with the latest registry from the meta
table all of the time.
If you want to keep track of order history, depending on requirements you might want to consider having a separate table (orderHistory
or 'orderChanges'), or partitioning the order
table. Now, especially for this last case, the order
table could have a composite key composed by (orderId, metaId)
, making it Unique and a candidate index for joining with the meta
table.
-- After the explaining comment:
The meta
table will be the most heavily used one, so keep everything as short as possible on this one:
Clustered index on metaId
Add a table entityTypes
which relates entityName CHAR
to entityType int
. This will also get you a nice list of entities for anyone new looking at the database to know which entities relate to meta
Try having entityId
as int
for all entities (If you can't change/add column to the existing tables, add an intermediary table) - This will reduce the field to 4bytes and allow one single data type for all entities.
Add an index to meta
table on entityType
(now int) and entityId
(now int). Your rows on this heavily used table are now so short you should be doing everything lightning fast.
Best Answer
The answer depends on what you intend to do with the data. Are you going to be doing frequent queries against it or do you only need it for the occasional lookup of what happened when, to rollback bad changes, or for regulatory reasons?
In the first case above, use the second process you descibed although I would add a column to denote the active record. I would also add a record created date. I wouild also create a view of just the active records and use that for allof my code that needs to see the current data.
In the second case above, I would have audit tables that are populated through triggers (the offective way to create audit records for any change to the db) and that also include the data of the change and the user or application that made the change as well as the old and new values.