Database Design for updatable sequential records

database-designrelational-theory

I am working on creating a data model for storing data related to production tracking. I work for an engineering firm that models and analyzes data for our clients. There are several steps in the process and the process is constantly updating.

I am trying to model the processes and include parent processes and sequential order of processes.

For example:

Process Table
---------------------
ProcessID - uniqueidentifier
ProcessName - varchar
ProcessDescription - varchar
...

ProcessOrder Table
---------------------
ProcessID - uniqueidentifier FK - Process
ParentProcessID - uniqueidentifier FK - Process
ProcessOrder - int
...

The ProcessOrder column in the ProcessOrder Table would simply store a number representing which sequential step in the parent process it represents.

For example, a modeling procedure has the following steps: create new empty model, name model, enter model parameters. The Process Table would look like:

ProcessID | ProcessName | ProcessDescription
-------------------------------------------------
UUID1     | Modeling    | Create Model of Data
UUID2     | New Model   | create new empty model
UUID3     | Name Model  | name model
UUID4     | Parameters  | enter model parameters

The ProcessOrder Table would look like:

ProcessID | ParentProcessID | ProcessOrder
--------------------------------------------------
UUID2     | UUID1           | 1
UUID3     | UUID1           | 2
UUID4     | UUID1           | 3

The issue with this design is that when the workflow gets updated, the process order will change and I will need to update the ProcessOrder record for the process that changed and for all subsequent records with the same ParentProcessID.

Is there a better way to store this kind of data and maintain normalization?

Best Answer

Your design seems reasonable to me. While you do have to update all subsequent records when new processes are added or deleted that is easy to accomplish. You just issue an update like:

UPDATE ProcessOrder
SET ProcessOrder = ProcessOrder+1
WHERE ProcessOrder >= [step# where you want to insert]

and then do your insert or delete.

The only other way I can think of would be to design the schema to store the next process id on the row. Something like:

ProcessID | ParentProcessID | NextId
--------------------------------------------------
UUID2     | UUID1           | UUID3
UUID3     | UUID1           | UUID4
UUID4     | UUID1           | NULL

Then if you insert a new step - say between UUID3 and UUID4, you perform more of a linked list operation which will update UUID3|UUID1's NextId to UUID5 and then just insert the new UUID5 with a NextId of UUID4.

This will reduce the UPDATEs to 1 in most cases, but it will make querying the process more difficult as now you have to walk the list from top to bottom to list out step by step.

You need to decide which process you want to favor - inserting and updating or retrieving. If you favor retrieval (which you might if changes are infrequent and reporting is frequent, and the lists are short), then go with your original design. If you favor insert and update (which you might if changes are happening all the time and reporting is infrequent, or lists are really really long), then go with the linked list approach.

I hope this helps. Interested in what other solutions the community might come up with as I'd love to broaden my knowledge around this!

Related Question