Mysql – Db design for a quiz with variable number of answer choices, and each worth different score in a different category

database-designMySQLschema

I'm working on a question/answer project and having a hard time coming up with a suitable database structure.

Each question can have from 1 – "n" answers, and each answer is associated with one of 6 predefined categories.

For example:

Q1) Which magazine would you prefer to read?

Science (+1 for Science category)
Wired (+1 for Science, +1 for Tech)
Vogue (+1 for Fashion)
Time (+1 for Information, +1 for Social)
Fortune (+1 for Business)
People (+1 for Social)

Q2) Have you taken an art class?

Yes (+1 for Artistic)
No (+0 for Artistic)

Q3) What's the best part of a movie?

Action (+1 for Science)
Story (+1 for Social)
Music (+1 for Artistic)
Acting (+1 for Artistic)

I know I should have one table for Questions, and a separate table for Answers linked by question_ID, but I'm not sure how to deal with a variable number of answer choices that are different for each question, and their corresponding categories.

Best Answer

Consider a data model like this:

ERD

You have a list of quesitons in the QUESTION table. Each question can have two or more possible options for answers in the OPTION table. For each option there are one or more category score changes which are noted in OPTION_CATEGORY. When a USER takes the quiz their answers (i.e. choices of options) are recorded in ANSWER.

Note that with this design you can change someone's score in multiple categories with a single answer (such as in your example of reading WIRED magazine) and you also have the option of changing someone's score in a category by an increment other than 1 (as in your example of not having taken an art class). You could even change someones score by more than 1, for example in your question 1 you could score +10 in the science category for prefering to read Philosophical Transactions of the Royal Society!

Related Solutions

Mysql – Model with variable number of “properties” of different types.

hstore is ideal for this job, and isn't in any way experimental; it's been a part of PostgreSQL since 8.3 and is way easier to use in 9.1 and above thanks to CREATE EXTENSION.

The only real issue with hstore is that not all client drivers support converting hstore values to native maps or dictionaries, so you might need to get the properties for an entry as a table using the hstore sql function each, effectively converting the hstore to subtable result.

The only real alternatives I see are:

A simple master-detail table in EAV style. That'll work fine, it's just a bit annoying to query, you'll be using a lot of string_agg or array_agg calls and it's lots slower than hstore. This approach is completely portable across database systems.
Storing the properties as xml, which can be easier to parse from some applications and is sort-of indexable using functional indexes on xpath expressions. This approach is somewhat portable across database systems, in that it should work with any SQL/XML database.
Storing the properties as json, which is easy to consume from most applications but is currently non-indexable and pretty opaque to SQL at the moment (expect improvements in future Pg versions). This approach is somewhat PostgreSQL-specific, but you can use a text/memo/clob field to store json in any database.

As for re-using properties that're already defined, that'a a simple matter of maintaining a table of known property names. If you want you can generate it on the fly using SELECT DISTINCT skeys(properties) FROM thetable; queries on the properties hstores, but that'll be quite slow so I'd recommend maintaining a separate table.

Postgresql – Design DB for users with different information fields

You did not mention whether this was an OLTP or Decision Support solution; lets assume OLTP.

Basically, you are asking about superclass (the set of common attributes) and subclass (the set of different attributes) by some classification.

In your example you have indicated that the sub-class can have many values. When the number of sub-classes are known and static, you could model it as follows:

enter image description here

The DDL for the User table is as follows:

CREATE TABLE User (
UserId INTEGER DEFAULT nextval('UserId_seq')  NOT NULL,
UserName CHARACTER VARYING(40)  NOT NULL,
UserLname CHARACTER VARYING(256),
UserType CHARACTER VARYING(10),
CONSTRAINT PK_User PRIMARY KEY (UserId));

The UserType attribute is new and is used to identify which subclass you are storing for a given user.

For your CS Subclass example

insert into user (UserName, UserLname, UserType)
       values  ('Ali', 'Alipour', 'CS');

-- > Assume that the assigned UserId is = 1 for the purposes of this example

insert into cs_user_projects (user_id, projectName) 
       values  (1, 'p1');
insert into cs_user_projects (user_id, projectName) 
       values  (1, 'p3');
insert into cs_user_projects (user_id, projectName) 
       values  (1, 'p6');
       ...

Now the application would know that it had to go to the CS_USER tables to get the information for CS users as follows:

select UserName
     , UserLname
     , ProjectName
  from user u
  join cs_user_projects p
    on u.userid = p.userid
 where u.UserType = 'CS';

If you want multiple sub-class information (User Programming Skills), you have to join to that sub-class table as well.

This is a straight forward, simple design; however a potential drawback is that it is a static design. If you wish to represent another sub-class (Engineers, Nurses, ...); then the sub-class tables must be added and the application changed to handle the new sub-class use cases.

In order to allow the model to take on new sub-classes without a change in the data model; you must think of the sub-class as a data object and model the relationship in the following way:

enter image description here

The User table (super-class) stays the same and you still have a classification described by UserType; but now you have an Attrib table to hold the values for each sub-class of user.

As an attribute can belong to many users and a user can have many attributes, the relationship between users and attributes is called a many-to-many relationship. This is resolved by addition of the UserAttrib table with primary keys from each parent table. The Attribute Role is provided for context of the relationship. Let's see how this would load for your CS example again:

    -- identical to before
    insert into user (UserName, UserLname, UserType)
       values  ('Ali', 'Alipour', 'CS');

    -- Load up the different projects 
    insert into Attrib ('p1');
    insert into Attrib ('p2');
    insert into Attrib ('p3');
    insert into Attrib ('p4');
    insert into Attrib ('p5');
    insert into Attrib ('p6');

    -- Assign only the appropriate projects to Ali
    insert into userattrib( AttribValue, Userid, AttribRole)
       values ('p1',1,'PROJECT');
    insert into userattrib( AttribValue, Userid, AttribRole)
       values ('p3',1,'PROJECT');
    insert into userattrib( AttribValue, Userid, AttribRole)
       values ('p6',1,'PROJECT');

To query our little data model as before:

select UserName
     , UserLname
     , AttribValue Project
  from user1 u
  join UserAttrib a
    on u.UserId = a.UserId
 where a.AttribRole = 'PROJECT'
   and u.UserType   = 'CS';

Note that to return multiple attributes (e.g. also return User Programming Skills say), you would have to join to the UserAttrib table again. The benefit in this design is that if a new sub-class comes along (Engineers who specialize in Electrical, Chemical, Mechanical, ... sub-classifications). Then you would simply add data for these classifications and assignments as above without having to change the data-model to do so.

These are the classical, general solutions to super-class/sub-class design for ALL Databases that you mention in your question.

In the Postgres Database case, you should investigate the INHERITS clause of CREATE TABLE.

It allows a table (sub-class) to inherit the parent (super-class) table definition. In addition, you can alter the child (sub-class) tables to add additional attributes to achieve your desired effect of tables which would match on a sub-class by sub-class basis. Note that this implementation method is very similar to the first method we discussed (static, known sub-classes) and would not apply in all design criteria (e.g. dynamic and/or swiftly changing sub-classes).

Best Answer

Related Solutions

Mysql – Model with variable number of “properties” of different types.

Postgresql – Design DB for users with different information fields

Related Question