I have found a clever way to do this query without a self join.
I ran these commands in MySQL 5.5.8 for Windows and got the following results:
use test
DROP TABLE IF EXISTS answers;
CREATE TABLE answers (user_id VARCHAR(10),question_id INT,answer_value VARCHAR(20));
INSERT INTO answers VALUES
('Sally',1,'Pouch'),
('Sally',2,'Peach'),
('John',1,'Pooch'),
('John',2,'Duke');
INSERT INTO answers VALUES
('Sally',1,'Pooch'),
('Sally',2,'Peach'),
('John',1,'Pooch'),
('John',2,'Duck');
SELECT user_id,question_id,GROUP_CONCAT(DISTINCT answer_value) given_answers
FROM answers GROUP BY user_id,question_id;
+---------+-------------+---------------+
| user_id | question_id | given_answers |
+---------+-------------+---------------+
| John | 1 | Pooch |
| John | 2 | Duke,Duck |
| Sally | 1 | Pouch,Pooch |
| Sally | 2 | Peach |
+---------+-------------+---------------+
This display reveals that John gave two different answers to question 2 and Sally gave two different answers to question 1.
To catch which questions were answered differently by all users, just place the above query in a subquery and check for a comma in the list of given answers to get the count of distinct answers as follows:
SELECT user_id,question_id,given_answers,
(LENGTH(given_answers) - LENGTH(REPLACE(given_answers,',','')))+1 multianswer_count
FROM (SELECT user_id,question_id,GROUP_CONCAT(DISTINCT answer_value) given_answers
FROM answers GROUP BY user_id,question_id) A;
I got this:
+---------+-------------+---------------+-------------------+
| user_id | question_id | given_answers | multianswer_count |
+---------+-------------+---------------+-------------------+
| John | 1 | Pooch | 1 |
| John | 2 | Duke,Duck | 2 |
| Sally | 1 | Pouch,Pooch | 2 |
| Sally | 2 | Peach | 1 |
+---------+-------------+---------------+-------------------+
Now just filter out rows where multianswer_count = 1 using another subquery:
SELECT * FROM (SELECT user_id,question_id,given_answers,
(LENGTH(given_answers) - LENGTH(REPLACE(given_answers,',','')))+1 multianswer_count
FROM (SELECT user_id,question_id,GROUP_CONCAT(DISTINCT answer_value) given_answers
FROM answers GROUP BY user_id,question_id) A) AA WHERE multianswer_count > 1;
This is what I got:
+---------+-------------+---------------+-------------------+
| user_id | question_id | given_answers | multianswer_count |
+---------+-------------+---------------+-------------------+
| John | 2 | Duke,Duck | 2 |
| Sally | 1 | Pouch,Pooch | 2 |
+---------+-------------+---------------+-------------------+
Essentially, I performed three table scans: 1 on the main table, 2 on the small subqueries. NO JOINS !!!
Give it a Try !!!
Best Answer
Assuming question has just one correct answer and possible answers don't have to be different for each set of tests, you can do something like (it's a rough idea, I might under (or over) engineer some concepts):
Question : id, question_text, correct_answer_id (FK to answer.id)
Answer : id, answer_text
Question_possible_answer (question_id (FK to question), answer_id(FK to anwer))
Test : id, test_name
Test_question : id, test_id(FK to Test), Question (FK to Question)
QUESTIONNAIRE (id, customer_id, test_id, questionnaire_date) -- I assume that a customer may pass test more than once, so no unique key on (customer_id, test_id), but rather on (customer_id, test_id,questionnaire_date)
QUESTIONNAIRE_Answer (questionnaire_id (FK to QUESTIONNAIRE), test_question_id(FK to Test_question), answer_id(FK to Answer))
Now for each question you add some possible answers in Question_possible_answer table, then you create a new test and populate Test_question table with the questions. You create a new QUESTIONNAIRE by linking test and customer, all answers from this customer go to QUESTIONNAIRE_Answer table.