I've read a lot from both camps in the surrogate vs. natural key debate. What I am struggling to understand, and what seems to be glossed over, especially in explaining how surrogate keys are more efficient, is how do referencing tables resolve a surrogate key?
Say, I am using an ORM in an application, and have a userID
in an application object. now we want to insert into a SelectedTask
table and a SubTask
table that are defined as follows (ignore the awkwardness of the real word example):
CREATE TABLE SelectedTask(
userID INT References User(userID)
taskNumber INT,
name TEXT ,
jobSiteID INT References Site(jobSiteID),
PRIMARY KEY (???)
);
CREATE TABLE SubTask(
assignee INT References User(assignee),
subTaskID INT References Subtasks(subTaskID),
timeAllowed INT,
reportTo INT References Group(reportTo),
-- how do we reference SelectedTask?
);
If we use a surrogate key for the SelectedTask
table, how do we use that in the SubTask
table? If we use natural keys, the primary key could simply be a compound primary key of userId
and taskNumber
, and we could insert into both at the same time assuming we have programmatically gathered all of the pieces.
Best Answer
A proper Postgres schema for your example could look like this:
Use a
serial
column as surrogate PK.You can't use reserved words like
user
orgroup
as identifier.Some other improvements. Detailed instructions here:
INSERT
in both tables at the same time?Use a data-modifying CTE with the
RETURNING
clause, or twoINSERT
statements in a transaction withlastval()
(which is the Postgres equivalent forSCOPE_IDENTITY()
in SQL server @Aaron mentioned in his comment):Related answer with code example for either method.