I'm desigining a database for a university testing tool – running tests in the cloud, very similar to a CI system.
It needs to store users, courses, enrolments, assessments, submissions & results.
It's not a system powering the entire institution, but it is important for a few different courses. I'd like the database to be well-structured, extensible, and "intuitive" for future maintainers.
Requirements:
- Two distinct courses may have the same name (but be run in different years/periods)
- A single course can only be run once per period (of which there are multiple per year)
- Assessment names must be unique within a course, but not across other courses.
- A user may be enrolled in a course as an instructor, tutor, or student, but not multiple at once. A student may become a tutor for the same course in the future.
I've included my schema below, with relevant comments where I feel they're needed.
CREATE TYPE PERIOD AS ENUM ('summer', 'first', 'autumn', 'winter', 'second', 'spring');
CREATE TYPE ASSESSMENT_TYPE AS ENUM ('assignment', 'lab');
CREATE TYPE TEST_RESULT AS ENUM ('warnings', 'errors', 'perfect');
CREATE TYPE ROLE AS ENUM ('student', 'tutor', 'convenor', 'admin');
CREATE TABLE users (
uid VARCHAR(20) PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
password BYTEA NOT NULL
);
CREATE TABLE courses (
-- id is used instead of (code, period, year) to
-- uniquely identify a course, since its easier.
id SERIAL PRIMARY KEY,
-- code is used within the university to identify a course,
-- but not when it was run (a single course can be run
-- multiple times, with different students/assessments/etc)
code VARCHAR(20) NOT NULL,
name VARCHAR(255) NOT NULL,
period PERIOD NOT NULL,
year INTEGER NOT NULL,
-- ensure that a single course can only be run once per period.
CONSTRAINT once_per_period UNIQUE (code, period, year)
);
CREATE TABLE assessment (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
type ASSESSMENT_TYPE NOT NULL,
comments TEXT,
course_id INTEGER REFERENCES courses(id),
-- assessment names must be unique within a course,
-- but not across courses.
CONSTRAINT unique_name_per_course UNIQUE (name, course_id)
);
CREATE TABLE test_results (
id SERIAL PRIMARY KEY,
result TEST_RESULT NOT NULL,
warnings TEXT NOT NULL,
errors TEXT NOT NULL
);
CREATE TABLE submissions (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
description TEXT NOT NULL,
feedback TEXT NOT NULL,
uid VARCHAR(20) REFERENCES users(uid),
assessment_id INTEGER REFERENCES assessment(id),
result_id INTEGER REFERENCES test_results(id)
);
CREATE TABLE enrol (
user_uid VARCHAR(20) REFERENCES users(uid),
course_id INTEGER REFERENCES courses(id),
role ROLE NOT NULL,
-- ensure a user cannot have multiple roles for a single course
CONSTRAINT one_role_per_course_run UNIQUE (user_uid, course_id)
);
I'd like some feedback on my design – what can I do better? Is there anything I missed? Any common/idiomatic design choices?
I was going to post in the code review stackexchange, but I saw a previous answer that mentioned this as a better location.
Cheers
Best Answer
Just a few thoughts.
I would not use
ENUM
s. I hate to say "always" (open to correction on this) but I would "always" use aFOREIGN KEY
. Consider gender - 25 years ago, you could have had anENUM
('M', 'F') - now you can have:Unknown
Intersex
Non-binary
Decline-to-answer
Don't know
Don't care
:-)&c., &c...
With a
FOREIGN KEY
things are far easier to change into the future. Furthermore, you make the point that you want yourdatabase to be well-structured, extensible, and "intuitive" for future maintainers(*)
-FOREIGN KEY
s will do this by making your system portable to another RDBMS! After an FK, I'd go for aCHECK CONSTRAINT
!(*)
upvoting for this alone! If only all system designers were so thoughtful - some of my sleepless nights as a programmer might have been avoided! :-)Check out this thread on SO (both question and accepted answer from the same poster). It tries to go through all options (but misses one - see below!) and the "consensus" (if such a thing can be said to exist on these sites) is,
FK
s, followed byCC
s. Check out this - there's a reason that poster has a rep > 300,000 (not to mention a shedload here too)!Gotchas. MySQL (amazingly!) doesn't support
CK
s! Plus, another factor mitigating againstCC
s is that, frequently, DBAs and/or sysadmins are the only operators allowed to change system tables (which is what modifyingCC
s involves) whereas ordinary programmers/users can change ordinary tables.A corollary to this is that if you want tight control over disk activity, you might want the smallest possible amount of it on your system tables - every operation involving a
CC
involves querying them, so the overhead of querying theFK
tables is not such a big deal after all.A factor in favour of
CC
s overENUM
s is that you can use theLIKE
and other operators on them - you can't with the latter. Finally, take a look at the horrible syntax here to compare values inENUM
s to other values (remember they're different types!). And if you're doing lookups of values for, say, dropdowns, you'd have to use something like this:as opposed to:
All in all, I would strongly urge you to use
FK
s orCC
s to enforce Declarative Referential Integrity (DRI) on your tables.There are (mentioned only for completeness) ways of restricting values in tables - you could use
DOMAIN
s orTRIGGER
s. The former seem superior toENUM
s, whereasTRIGGER
s (IMHO) should only be used for complex business logic!Finally, to answer the question (phew...).
Your schema seems fine - it's pretty difficult to judge these things by eye. What I would do if I were you, is to start using/developing against it and see if you come up against problems. Get back to us here with specific issues if you do - your question is a bit broad! p.s. welcome to the forum!