http://dirtsimple.org/2010/11/simplest-way-to-do-tree-based-queries.html provides an algorithm for inserting and deleting from a Closure Table.
I'd like to model a similar data structure, except that nodes may have multiple parents.
Given:
If we remove [B, C]
I expect to end up with:
and if we remove node B
I expect to end up with:
However, if you use the author's algorithm for removing links or nodes you will notice that it tags [D, C, 1]
for deletion, which is undesirable.
What I've tried so far
I have tried adapting the original data structure by adding a references
column that indicates how many ways there are to travel between two nodes. In the above example, you can travel from A
to C
either through B
or through D
. The idea would have been that when B
gets removed, the path from A
to C
gets kept and the reference count decreases from 2 to 1. It was nice in theory, but I couldn't figure out how to get the implementation working and now I wonder if it's at all possible (the data structure might not contain enough information to figure out which rows to remove).
What I'm asking
How would you adapt Closure Tables to support multiple parents? What alternative data structures would you recommend? https://stackoverflow.com/q/4048151/14731 contains an exaustive list of such data structures, but it's not clear which ones support (or are best for) multiple parents.
Best Answer
Usually create a nodes table and a relationships table. Directed graphs are't really hierarchical and they can have loops which makes querying harder. But if you think of a DAG as being a generalised tree (I.e. a tree that allows multiple parents but is still strictly hierarchical) and a directed graph as being a generalised DAG (i.e. like a DAG but not strictly hierarchical) things get easier.
So for a very simple, PostgreSQL solution we might do something like:
Then you can query something like this: