The id field in your ny_stations table does not seem to be defined as a serial, so it is expected that pg_get_serial_sequence returns nothing.

The duplicate key error means that your SELECT DISTINCT ... FROM ny_raw_trips ... query is returning two rows with the same start_station_id:
SELECT start_station_id, COUNT(*) FROM (
SELECT DISTINCT start_station_id, start_station_name, start_station_latitude, start_station_longitude
FROM ny_raw_trips
WHERE start_station_id NOT IN (SELECT id FROM ny_stations)
) a
GROUP BY start_station_id
HAVING COUNT(*) > 1;
You could list the rows that are introducing the duplication like this:
WITH src AS (
SELECT DISTINCT start_station_id, start_station_name, start_station_latitude, start_station_longitude
FROM ny_raw_trips
WHERE start_station_id NOT IN (SELECT id FROM ny_stations)
)
SELECT *
FROM src
WHERE start_station_id IN (SELECT start_station_id FROM src GROUP BY start_station_id HAVING COUNT(*) > 1)
ORDER BY start_station_id;
Edit
Once you have found the offending duplicates, and if you consider the first occurrence of each good enough (e.g. the differences are trivial, in the name or coordinate fields), you can use DISTINCT ON:
INSERT INTO ny_stations (id, name, latitude, longitude)
SELECT DISTINCT ON (start_station_id) start_station_id, start_station_name, start_station_latitude, start_station_longitude
FROM ny_raw_trips
WHERE start_station_id NOT IN (SELECT id FROM ny_stations)
ORDER BY start_station_id;
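Note that DISTINCT ON keeps the first row of each group in ORDER BY order, so if it matters which duplicate survives, add a tie-break column to the ORDER BY. A sketch; the extra sort column is an assumption, pick whatever makes one row preferable in your data:
INSERT INTO ny_stations (id, name, latitude, longitude)
SELECT DISTINCT ON (start_station_id)
       start_station_id, start_station_name, start_station_latitude, start_station_longitude
FROM ny_raw_trips
WHERE start_station_id NOT IN (SELECT id FROM ny_stations)
ORDER BY start_station_id, start_station_name;  -- e.g. prefer the alphabetically first name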
It seems there is a UNIQUE INDEX on a(user_id) and you're trying to assign a value that already exists to more than one row.
Have a look at the following example:
create table a (id int, user_id int, f_id int);
create table b (id int, f_id int);
create unique index a_user_id_key on a(user_id);
insert into a
values (1, 1, 11),
(2, 3, 22),
(3, 2, 33),
(4, 5, 44 );
insert into b
values (1, 11),
(3, 22),
(3, 33), --<<<< a second row with id = 3, so the update will try to assign user_id = 3 to two rows
(4, 44);
When I try to update using your current query:
update a
set user_id = b.id
from b
where a.f_id = b.f_id;
It returns the same error message:
ERROR: duplicate key value violates unique constraint "a_user_id_key"
DETAIL: Key (user_id)=(3) already exists.
You could fix it by skipping values that already exist in a:
update a
set user_id = b.id
from b
where a.f_id = b.f_id
and not exists(select 1 from a where a.user_id = b.id);
This is the result:
select * from a;
id | user_id | f_id
-: | ------: | ---:
1 | 1 | 11
2 | 3 | 22
3 | 2 | 33
4 | 4 | 44
db<>fiddle here
But I'd suggest first checking which rows are the duplicates, using this query:
select a.*
from a
join b
on a.f_id = b.f_id
and exists(select 1 from a where user_id = b.id and f_id <> b.f_id);
id | user_id | f_id
-: | ------: | ---:
3 | 2 | 33
db<>fiddle here
Best Answer
In addition to what @Craig provided (and correcting some of it):
Effective Postgres 9.4, UNIQUE, PRIMARY KEY and EXCLUDE constraints are checked immediately after each row when defined NOT DEFERRABLE. This is different from other kinds of NOT DEFERRABLE constraints (currently only REFERENCES (foreign key) constraints), which are checked after each statement. We worked all of this out under this related question on SO.

It is not enough for a UNIQUE (or PRIMARY KEY or EXCLUDE) constraint to be DEFERRABLE to make your presented code with multiple statements work. And you cannot use ALTER TABLE ... ALTER CONSTRAINT for this purpose. Per documentation:
Bold emphasis mine. Use instead:

Drop the constraint and add it back in a single statement, so there is no time window for anybody to sneak in offending rows. For big tables it would be tempting to preserve the underlying unique index somehow, because deleting and recreating it is costly. Alas, that does not seem to be possible with standard tools (if you have a solution for that, please let us know!):
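A sketch of that drop-and-re-add in one statement, assuming a table t with a unique constraint named t_col_uni on column col (names are illustrative, not from your schema):
ALTER TABLE t
  DROP CONSTRAINT t_col_uni
, ADD  CONSTRAINT t_col_uni UNIQUE (col) DEFERRABLE;
Since this is a single ALTER TABLE, both steps are applied atomically and no conflicting row can be inserted in between.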
For a single statement making the constraint deferrable is enough:
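For example, shifting all values of a unique column in one statement (hypothetical table t with a DEFERRABLE unique constraint on col):
UPDATE t SET col = col + 1;
With a NOT DEFERRABLE constraint this can fail on a transient duplicate, because each row is checked as it is updated; once the constraint is DEFERRABLE, the check runs at the end of the statement.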
A query with CTEs also is a single statement:
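For instance, a data-modifying CTE can free a value and reuse it within the same statement (hypothetical names again, assuming a DEFERRABLE unique constraint on t.col):
WITH del AS (
   DELETE FROM t WHERE col = 2
)
UPDATE t SET col = 2 WHERE col = 1;
The unique check only sees the final state at the end of the statement.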
However, for your code with multiple statements you additionally need to actually defer the constraint, or define it as INITIALLY DEFERRED.
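Deferring within a transaction might look like this (sketch; t_col_uni stands for your DEFERRABLE constraint's name):
BEGIN;
SET CONSTRAINTS t_col_uni DEFERRED;
-- multiple statements creating transient duplicates ...
COMMIT;  -- the unique check runs here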
Either is typically more expensive than the above, but it may not be easily feasible to pack everything into one statement.

Be aware of a limitation in connection with FOREIGN KEY constraints, though. Per documentation:

So you cannot have both at the same time.