Sql-server – Database Design for Preserving Values and Additional columns

database-designeavsql server

My colleague and I are having some trouble coming up with a database design, and I am doing my best to stay away from EAV for now.

We have an entity called Preparation that stores configuration data for machines. There are two types of processes per machine, let's call these process1 and process2, and a machine can only have one process. Both processes share a lot of attributes so setting up one table seemed logical.

However process2 has a lot of extra data, and even sub-categorizes itself with process2a, processes2b that is specific to those sub categories (process1 only has one extra attribute for now). So that one table will have nulls in columns that are not specific to its process.

Splitting the table for each process was another option. However, we need to preserve any data change and splitting the tables will bring more work (most all transactions will be INSERTS with a timestamp into a history table, followed by an UPDATE on the live table if they are updating a record). For example, if a machine changes from process1 to process2, then a record will have to be inserted in the history table from process1, deleted from process1 and then inserted into process2. We are using the standard edition of SQL Server, so no CDC to help with logging changes.

I am trying to keep these pretty relational and I would hate to create one table with expected NULLS (maybe create SPARSE columns if the NULL percentage is high?).

I am hoping someone has experience with this type of situation. Thank you.

Best Answer

Treat the various processes as sub-types. There will then be entity process_base, which contains all the common attributes, process1, process2, process2a etc. for the process-specific attributes.

Implement these as a table each. A view which combines them all together may simplify usage

create view process_all as
select <whatever>
from process_base
inner join process1 <etc>

union all

select <whatever>
from process_base
inner join process2 <etc>
...

This way you minimise the NULL columns (if that is desirable to you) but maintain the unity of "process" as a single idea.