Sql-server – the correct model for related tables with millions of rows

database-designperformancequery-performancesql server

I need to create a question and answer tables that will have millions (maybe billions) of rows.

The current model is:

Question Table
id_question (PK, auto increment)
id_user
question_content
question_date

Answer Table
id_answer (PK, auto increment)
id_question
id_user
answer_content
answer_date

Is this a correct model (considering better query performance)? Should I add the id_question and id_user columns to the primary key?

Thanks

Best Answer

You're right. Your tables would look something like this then ( oracle ddl ) :

create table Users (id int,
                    name varchar2(1000),
                    constraint pk_users primary key (id)
);

create table Questions (id int,
                        user_id int,
                        content CLOB,
                        question_date date,
                        constraint pk_questions primary key (id),
                        constraint fk_questions_user_id foreign key (user_id) references users(id)
                        );

create table Answers (id int,
                      question_id int,
                      user_id int,
                      content CLOB,
                      answer_date date,
                      constraint pk_answers primary key (id),
                      constraint fk_answers_questions foreign key (question_id) references Questions(id), 
                      constraint fk_answers_user_id foreign key (user_id) references users(id)
);