Not sure if this is the right stack exchange for asking this question. If I should ask this elsewhere let me know.
Let's pretend that I have a system in which at the moment there are 3 resources that have a prent/child relationship, but it is anticipated the hierarchy might grow into 10 resources but not more:
SuperGroup
\
+-- Group
\
+ SubGroup
For the sake of simplicity I have only kept the name field here. The parent child relationship is modelled using foreign-keys:
create table supergroup (
id serial primary key,
name text
)
create table group (
id serial primary key,
super_group_id int not null
foreign key references super_group(id),
name text
)
create table subgroup (
id serial primary key,
group_id int not null
foreign key references group(id),
name text
)
For each group, we want to track monthly status of a property/asset they all share where the property has multiple fields.
Let's call this property specific_property
and let's assume it has fields val1, val2, val3
and val4
.
At the moment I have modelled this association like this:
create table specific_property (
id serial primary key,
val1 text,
val2 int,
val3 float
val4 boolean
)
create table supergroup_specific_properties(
supergroup_id int foreign key references supergroup(id) not null,
specific_property_id int foreign key references specific_property(id) not null,
primary key (supergroup_id, specific_property_id)
)
create table group_specific_properties(
group_id int foreign key references group(id) not null,
specific_property_id int foreign key references specific_property(id) not null,
primary key (group_id, specific_property_id)
)
create table subgroup_specific_properties(
subgroup_id int foreign key references supergroup(id) not null,
specific_property_id int foreign key references specific_property(id) not null,
primary key (subgroup_id, specific_property_id)
)
One use case that I have is returning all specific_properties for a super group and all of its descendants:
select * from supergroup super
inner join group group on (group.supergroup_id = super.id)
inner join subgroup sub on (sub.subgroup_id = group.id)
inner join supergroup_specific_properties supersp on (supersp.supergroup_id = super.id)
inner join group_specific_properties gsp on (gsp.supergroup_id = group.id)
inner join subgroup_specific_properties subsp on (subsp.supergroup_id = sub.id)
inner join specific_property sp1 on (sp1.id = supersp.specific_property_id)
inner join specific_property sp2 on (sp2.id = gsp.specific_property_id)
inner join specific_property sp3 on (sp3.id = subsp.specific_property_id)
The nice thing about this model, if I am not missing anything here, is that it is normalized (3NF I suppose). The ugly thing here is that I have to create one join table per group type and search queries become join heavy and a bit complex.
An alternative model could have been:
create table specific_property (
id serial primary key,
val1 text,
val2 int,
val3 float
val4 boolean,
group_type enum('supergroup', 'group', 'subgroup'),
group_id int
)
The nice thing about this is that it is very simple and easy to use. The problem is I don't have referential integrity for group_id
.
A second alternative could look like this:
create table supergroup_specific_properties(
id serial primary key,
supergroup_id int foreign key references supergroup(id) not null,
val1 text,
val2 int,
val3 float
val4 boolean
)
create table group_specific_properties(
id serial primary key,
supergroup_id int foreign key references group(id) not null,
val1 text,
val2 int,
val3 float
val4 boolean
)
create table subgroup_specific_properties(
id serial primary key,
supergroup_id int foreign key references subroup(id) not null,
val1 text,
val2 int,
val3 float
val4 boolean
)
but this one has a lot of duplication and more difficult to evolve and maintain.
Is it possible to objectively choose one of them (or design an alternative) that:
- Is easy to use
- Protects data integrity and avoids anomalies
- Is performant for my given use case and does not become difficult for the planner to optimize when the hierarchy grows
Best Answer
Some food for thought:
Depending on the relationship between properties and group you would need one or two tables (can a group have more than one property?).
or
On the other hand, if different types of groups only differ in hierarchical position you can cope with just one table and a level attribute: