Postgresql – How to represent discrete ages in a relational database

database-designpostgresqlrelational-theory

I am using PostgreSQL to store data on a Person. I need to store whether each Person is capable of teaching a Child of each age year, from 0-17. The ages are discrete values, and a Person can have any number of the 18 (assuming 0-17 years old) assigned to their account. This will be used in a booking system, and also needs to form part of the search results.

Initially I considered creating 18 boolean fields, but this seems inefficient. Is there a better way to do it? I understand that Postgres supports JSONB, so this is an option, but I am unsure of the implications.

Is there a better way to store this data?

Best Answer

The ages are discrete values

OK.

a Person can have any number of the 18 (assuming 0-17 years old) assigned to their account.

So it's a many-to-many relationship?

If so, you just decompose your data into third normal form as usual, expressing the cardinality by means of one extra relation.

Example follows. The SQL dialect is not necessarily PostgreSQL as I do not have an installation handy right now.

Schema

-- persons relation
CREATE TABLE persons (
    personId INT NOT NULL PRIMARY KEY,
    name TEXT NOT NULL
    -- Any other attributes
);

-- classes relation
CREATE TABLE pupilClasses (
    age INT NOT NULL PRIMARY KEY,
    size ENUM ('small', 'medium', 'large') NOT NULL
);

-- "Tie" relation that expresses the many-to-many cardinality
-- between persons and classes
CREATE TABLE persons_pupilClasses (
    personId INT NOT NULL,
    age INT NOT NULL,
    PRIMARY KEY (personId, age),
    FOREIGN KEY (personId) REFERENCES persons (personId)
        ON UPDATE CASCADE ON DELETE CASCADE,
    FOREIGN KEY (age) REFERENCES pupilClasses (age)
        ON UPDATE CASCADE ON DELETE CASCADE
);

Data

-- Let us populate with some data

-- A few classes
INSERT INTO pupilClasses (age, size) VALUES (0, 'small');
INSERT INTO pupilClasses (age, size) VALUES (1, 'small');
INSERT INTO pupilClasses (age, size) VALUES (2, 'small');
INSERT INTO pupilClasses (age, size) VALUES (3, 'small');
INSERT INTO pupilClasses (age, size) VALUES (4, 'small');
INSERT INTO pupilClasses (age, size) VALUES (5, 'small');
INSERT INTO pupilClasses (age, size) VALUES (6, 'medium');
INSERT INTO pupilClasses (age, size) VALUES (7, 'medium');
INSERT INTO pupilClasses (age, size) VALUES (8, 'medium');
INSERT INTO pupilClasses (age, size) VALUES (9, 'medium');
INSERT INTO pupilClasses (age, size) VALUES (10, 'medium');
INSERT INTO pupilClasses (age, size) VALUES (11, 'medium');
INSERT INTO pupilClasses (age, size) VALUES (12, 'large');
INSERT INTO pupilClasses (age, size) VALUES (13, 'large');
INSERT INTO pupilClasses (age, size) VALUES (14, 'large');
INSERT INTO pupilClasses (age, size) VALUES (15, 'large');
INSERT INTO pupilClasses (age, size) VALUES (16, 'large');
INSERT INTO pupilClasses (age, size) VALUES (17, 'large');


-- A few persons
INSERT INTO persons (personId, name) VALUES (666, 'Alice');
INSERT INTO persons (personId, name) VALUES (667, 'Bertrand');
INSERT INTO persons (personId, name) VALUES (668, 'Carlos');

-- Who can teach to whom

-- Alice to 0−3 and 7−8 classes
INSERT INTO persons_pupilClasses (personId, age) VALUES (666, 0);
INSERT INTO persons_pupilClasses (personId, age) VALUES (666, 1);
INSERT INTO persons_pupilClasses (personId, age) VALUES (666, 2);
INSERT INTO persons_pupilClasses (personId, age) VALUES (666, 3);
INSERT INTO persons_pupilClasses (personId, age) VALUES (666, 7);
INSERT INTO persons_pupilClasses (personId, age) VALUES (666, 8);

-- Bertrand to 7−9, 13, 16−17 classes
INSERT INTO persons_pupilClasses (personId, age) VALUES (667, 7);
INSERT INTO persons_pupilClasses (personId, age) VALUES (667, 8);
INSERT INTO persons_pupilClasses (personId, age) VALUES (667, 9);
INSERT INTO persons_pupilClasses (personId, age) VALUES (667, 13);
INSERT INTO persons_pupilClasses (personId, age) VALUES (667, 16);
INSERT INTO persons_pupilClasses (personId, age) VALUES (667, 17);

-- Carlos can't teach anyone yet ☹

Logic

Let us get a report on those who can teach any classes

SELECT persons.name, GROUP_CONCAT(pupilClasses.age) classes
FROM persons
    INNER JOIN persons_pupilClasses
        ON persons.personId = persons_pupilClasses.personId
    INNER JOIN pupilClasses
        ON pupilClasses.age = persons_pupilClasses.age
GROUP BY persons.personId;

Result:

+----------+----------------+
| name     | classes        |
+----------+----------------+
| Alice    | 0,1,2,3,7,8    |
| Bertrand | 7,8,9,13,16,17 |
+----------+----------------+

Let us get a report of who can teach to whom, if anyone

SELECT persons.name, GROUP_CONCAT(pupilClasses.age) classes
FROM persons
    LEFT JOIN persons_pupilClasses
        ON persons.personId = persons_pupilClasses.personId
    LEFT JOIN pupilClasses
        ON pupilClasses.age = persons_pupilClasses.age
GROUP BY persons.personId;

Result:

+----------+----------------+
| name     | classes        |
+----------+----------------+
| Alice    | 0,1,2,3,7,8    |
| Bertrand | 7,8,9,13,16,17 |
| Carlos   | NULL           |
+----------+----------------+

Who teach class 3?

SELECT persons.name
FROM persons
    INNER JOIN persons_pupilClasses
        ON persons.personId = persons_pupilClasses.personId 
WHERE persons_pupilClasses.age = 3;

Result:

+-------+
| name  |
+-------+
| Alice |
+-------+

Who can teach classes 7 or 9?

SELECT DISTINCT persons.name
FROM persons
    INNER JOIN persons_pupilClasses
        ON persons.personId = persons_pupilClasses.personId 
WHERE persons_pupilClasses.age IN (7,9);

Result:

+----------+
| name     |
+----------+
| Alice    |
| Bertrand |
+----------+

Who can teach large sizes?

SELECT DISTINCT persons.name
FROM persons
    INNER JOIN persons_pupilClasses
        ON persons.personId = persons_pupilClasses.personId 
    INNER JOIN pupilClasses
        ON pupilClasses.age = persons_pupilClasses.age
WHERE pupilClasses.size = 'large';

Result:

+----------+
| name     |
+----------+
| Bertrand |
+----------+

Although you do not mention it in your description, I'm willing to bet there is at least one more important relation on your actual business scenario: pupils. And they probably move from one class to the next. A simple way to represent that relationship would be:

-- pupils relation
CREATE TABLE pupils (
    pupilId INT NOT NULL PRIMARY KEY,
    name TEXT NOT NULL,
    age INT NOT NULL,
    FOREIGN KEY (age) REFERENCES pupilClasses (age)
        ON UPDATE RESTRICT ON DELETE RESTRICT
);

-- A few pupils
INSERT INTO pupils (pupilId, name, age) VALUES (1313, 'Donald', 7);
INSERT INTO pupils (pupilId, name, age) VALUES (1314, 'Ernest', 7);
INSERT INTO pupils (pupilId, name, age) VALUES (1315, 'Frank', 9);
INSERT INTO pupils (pupilId, name, age) VALUES (1316, 'Gertrude', 0);
INSERT INTO pupils (pupilId, name, age) VALUES (1321, 'Hans', 13);

And some logic examples:

Who teaches whom?

SELECT persons.name teacher, pupils.name pupil
FROM persons
    INNER JOIN persons_pupilClasses
        ON persons.personId = persons_pupilClasses.personId 
    INNER JOIN pupils
        ON pupils.age = persons_pupilClasses.age;

Result:

+----------+----------+
| teacher  | pupil    |
+----------+----------+
| Alice    | Donald   |
| Bertrand | Donald   |
| Alice    | Ernest   |
| Bertrand | Ernest   |
| Bertrand | Frank    |
| Alice    | Gertrude |
| Bertrand | Hans     |
+----------+----------+

Who are each person's pupils?

SELECT persons.name teacher, GROUP_CONCAT(pupils.name) pupils
FROM persons
    INNER JOIN persons_pupilClasses
        ON persons.personId = persons_pupilClasses.personId 
    INNER JOIN pupils
        ON pupils.age = persons_pupilClasses.age
GROUP BY persons.personId;

Result:

+----------+--------------------------+
| teacher  | pupils                   |
+----------+--------------------------+
| Alice    | Donald,Ernest,Gertrude   |
| Bertrand | Donald,Ernest,Frank,Hans |
+----------+--------------------------+

Who are each pupil's teachers?

SELECT pupils.name pupil, GROUP_CONCAT(persons.name) teachers
FROM persons
    INNER JOIN persons_pupilClasses
        ON persons.personId = persons_pupilClasses.personId 
    INNER JOIN pupils
        ON pupils.age = persons_pupilClasses.age
GROUP BY pupils.pupilId;

Result:

+----------+----------------+
| pupil    | teachers       |
+----------+----------------+
| Donald   | Alice,Bertrand |
| Ernest   | Alice,Bertrand |
| Frank    | Bertrand       |
| Gertrude | Alice          |
| Hans     | Bertrand       |
+----------+----------------+

From this a couple things should be clear:

  • age may be a bit of a misnomer it may not correspond with the actual natural age of pupils attending those classes.

  • It's probably a good idea to avoid the temptation of treating it as a continuous range. This is why I quoted your statement that these are discrete values.

For the rest, unless I misunderstood, seems a fairly straightforward normalisation problem.

By the way:

Initially I considered creating 18 boolean fields,

Where? In the person relation?

but this seems inefficient.

It's not inefficient. It's a red herring.

Normalise your design first. Once you've gained experience with the data you will be handling you can start thinking about introducing optimisations (which may involve denormalisation). Pre-emptive optimisation is just one of those terrible ideas though.