I am using PostgreSQL to store data on a Person. I need to store whether each Person is capable of teaching a Child of each age year, from 0-17. The ages are discrete values, and a Person can have any number of the 18 age values (0-17) assigned to their account. This will be used in a booking system, and also needs to form part of the search results.
Initially I considered creating 18 boolean fields, but this seems inefficient. Is there a better way to do it? I understand that Postgres supports JSONB, so this is an option, but I am unsure of the implications.
Is there a better way to store this data?
Best Answer
OK.
So it's a many-to-many relationship?
If so, you just decompose your data into third normal form as usual, expressing the cardinality by means of one extra relation.
Example follows. The SQL dialect is not necessarily PostgreSQL as I do not have an installation handy right now.
Schema
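A minimal sketch of what the decomposition might look like; all names here are illustrative, and I'm assuming each class corresponds to one age year and has a size:

```sql
-- Illustrative names and columns; adapt to your own scenario.
CREATE TABLE person (
    person_id integer PRIMARY KEY,
    name      text NOT NULL
);

CREATE TABLE class (
    class_id integer PRIMARY KEY,
    age      integer NOT NULL CHECK (age BETWEEN 0 AND 17),
    size     integer NOT NULL              -- pupils per class
);

-- The extra relation expressing the many-to-many cardinality.
CREATE TABLE teaches (
    person_id integer NOT NULL REFERENCES person,
    class_id  integer NOT NULL REFERENCES class,
    PRIMARY KEY (person_id, class_id)
);
```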
Data
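Some sample rows, assuming illustrative tables person(person_id, name), class(class_id, age, size) and teaches(person_id, class_id):

```sql
INSERT INTO person (person_id, name) VALUES
    (1, 'Alice'),
    (2, 'Bob'),
    (3, 'Carol');                -- Carol teaches no class at all

INSERT INTO class (class_id, age, size) VALUES
    (1, 3, 10),
    (2, 7, 25),
    (3, 9, 30);

INSERT INTO teaches (person_id, class_id) VALUES
    (1, 1),                      -- Alice teaches ages 3 and 7
    (1, 2),
    (2, 3);                      -- Bob teaches age 9
```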
Logic
Let us get a report on those who can teach any class
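One way to phrase it, assuming illustrative tables person(person_id, name) and teaches(person_id, class_id):

```sql
SELECT p.name
FROM person AS p
WHERE EXISTS (SELECT 1
              FROM teaches AS t
              WHERE t.person_id = p.person_id);
```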
Result:
Let us get a report of who can teach whom, if anyone
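The "if anyone" part calls for outer joins, so people who teach nothing still appear; again assuming illustrative person, teaches and class(class_id, age) tables:

```sql
SELECT p.name, c.age
FROM person AS p
LEFT JOIN teaches AS t ON t.person_id = p.person_id
LEFT JOIN class   AS c ON c.class_id  = t.class_id
ORDER BY p.name, c.age;          -- people with no class show a NULL age
```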
Result:
Who teaches class 3?
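A sketch, reading "class 3" as the class for age year 3 and assuming illustrative person, teaches and class tables:

```sql
SELECT p.name
FROM person  AS p
JOIN teaches AS t ON t.person_id = p.person_id
JOIN class   AS c ON c.class_id  = t.class_id
WHERE c.age = 3;
```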
Result:
Who can teach classes 7 or 9?
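The same shape with an IN list, under the same illustrative schema assumptions:

```sql
SELECT DISTINCT p.name
FROM person  AS p
JOIN teaches AS t ON t.person_id = p.person_id
JOIN class   AS c ON c.class_id  = t.class_id
WHERE c.age IN (7, 9);
```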
Result:
Who can teach large sizes?
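Assuming class carries a size column, this becomes a filter on it; the threshold of 25 is arbitrary:

```sql
-- "Large" is whatever threshold suits your scenario; 25 is arbitrary here.
SELECT DISTINCT p.name
FROM person  AS p
JOIN teaches AS t ON t.person_id = p.person_id
JOIN class   AS c ON c.class_id  = t.class_id
WHERE c.size >= 25;
```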
Result:
Although you do not mention it in your description, I'm willing to bet there is at least one more important relation in your actual business scenario: pupils. And they probably move from one class to the next. A simple way to represent that relationship would be:
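For instance, a pupil relation pointing at the class the pupil currently attends (names illustrative, and assuming a class table keyed on class_id):

```sql
CREATE TABLE pupil (
    pupil_id integer PRIMARY KEY,
    name     text NOT NULL,
    class_id integer REFERENCES class   -- updated as the pupil moves up
);
```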
And some logic examples:
Who teaches whom?
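Joining teachers to pupils through the class, under the same illustrative schema assumptions:

```sql
SELECT tc.name AS teacher, pu.name AS pupil
FROM teaches AS t
JOIN person  AS tc ON tc.person_id = t.person_id
JOIN pupil   AS pu ON pu.class_id  = t.class_id
ORDER BY teacher, pupil;
```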
Result:
Who are each person's pupils?
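Starting from person with outer joins, so teachers with no pupils still show up:

```sql
SELECT p.name AS teacher, pu.name AS pupil
FROM person AS p
LEFT JOIN teaches AS t  ON t.person_id = p.person_id
LEFT JOIN pupil   AS pu ON pu.class_id = t.class_id
ORDER BY teacher, pupil;         -- NULL pupil for those teaching nobody
```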
Result:
Who are each pupil's teachers?
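And the mirror image, starting from pupil:

```sql
SELECT pu.name AS pupil, p.name AS teacher
FROM pupil AS pu
LEFT JOIN teaches AS t ON t.class_id  = pu.class_id
LEFT JOIN person  AS p ON p.person_id = t.person_id
ORDER BY pupil, teacher;         -- NULL teacher if nobody covers the class
```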
Result:
From this a couple of things should be clear:

- `age` may be a bit of a misnomer: it may not correspond with the actual natural age of pupils attending those classes.
- It's probably a good idea to avoid the temptation of treating it as a continuous range. This is why I quoted your statement that these are discrete values.
For the rest, unless I misunderstood, this seems a fairly straightforward normalisation problem.
By the way:

> Initially I considered creating 18 boolean fields, but this seems inefficient.

Where? In the `person` relation? It's not inefficient. It's a red herring.
Normalise your design first. Once you've gained experience with the data you will be handling you can start thinking about introducing optimisations (which may involve denormalisation). Pre-emptive optimisation is just one of those terrible ideas though.