My colleague and I are having some trouble coming up with a database design, and I am doing my best to stay away from EAV for now.
We have an entity called Preparation
that stores configuration data for machines. There are two types of processes per machine, let's call these process1
and process2
, and a machine can only have one process. Both processes share a lot of attributes so setting up one table seemed logical.
However process2
has a lot of extra data, and even sub-categorizes itself with process2a
, processes2b
that is specific to those sub categories (process1
only has one extra attribute for now). So that one table will have nulls in columns that are not specific to its process.
Splitting the table for each process was another option. However, we need to preserve any data change and splitting the tables will bring more work (most all transactions will be INSERTS
with a timestamp into a history table, followed by an UPDATE
on the live table if they are updating a record). For example, if a machine changes from process1
to process2
, then a record will have to be inserted in the history table from process1
, deleted from process1
and then inserted into process2
. We are using the standard edition of SQL Server, so no CDC to help with logging changes.
I am trying to keep these pretty relational and I would hate to create one table with expected NULLS (maybe create SPARSE
columns if the NULL percentage is high?).
I am hoping someone has experience with this type of situation. Thank you.
Best Answer
Treat the various processes as sub-types. There will then be entity
process_base
, which contains all the common attributes,process1
,process2
,process2a
etc. for the process-specific attributes.Implement these as a table each. A view which combines them all together may simplify usage
This way you minimise the NULL columns (if that is desirable to you) but maintain the unity of "process" as a single idea.