Postgresql – Email address unique or primary key

postgresqlprimary-keyunique-constraint

I am a newbie in databases. I read around and found out that it's probably not a great idea to use email address as primary key because string comparisons are slower which effects performance in complex joins and if an email changes I'd have to change all the foreign keys which requires a lot of effort.

But if my users table requires every user to have an email address and each of those email address should be unique, will adding a unique index on the email column suffice? Because afaik unique fields allow null values, whereas I require every user to have an email address, not allowing null values. Is there something I'm missing here? Or I'm suppose to make email column unique and make sure during data validation on the server that user does enter an email address so that every user has one?

Best Answer

Let's first distinguish between keys and indexes, key is part of the logical model and is often implemented with an unique index. You can however create a unique index without creating a key, but that can not be referenced by a foreign key.

A candidate key is something that uniquely identifies a row in a table, in SQL one of the candidate keys are normally used as a primary key (I never really understood why one of the c.k. is concidered "better" than the others, but that's another story), and the remaining c.k becomes unique constraints.

A unique constraint can be used the same way as a primary key can. Consider:

create table A ( x ... not null
               , y ... not null
               , z ... not null
               ,     unique (x)
               ,     primary key (y,z) );

create table B ( x ...
               ,   ...
               ,     foreign key (x) references A (x) );

create table C ( y ...
               , z ...
               ,   ...
               ,     foreign key (y, z) references A (y, z) );  

B references the unique constraint and C references the primary key constraint.

NOT NULL is yet another kind of constraint. In your case you can enforce this for email without declaring it unique.

The next aspect of your post concerns the stability of a key, a key should be stable (but that doesn't mean it can never change, it does not have to be immutable). Some DBMS implements ON UPDATE CASCADE that can be of help for such operation, still if the key is distributed around your model it will be a pain updating it.

In your case I would probably choose another candidate key as the primary key, and declare email to be NOT NULL and UNIQUE.