PostgreSQL Primary Key – How to Handle Duplicates in UPDATE

duplicationpostgresqlprimary-key

Let's assume a scenario with the entities Person and Company and a table PersonCompanyStocks that models how many stocks a persons owns of a certain company (N:M cardinality). For example:

person | company | num_stocks
-----------------------------
Alice  | foo     | 300
Bob    | foo     | 100
Bob    | bar     | 200

This table uses (person, company) as a primary key to guarantee unique entries, and foreign keys to the respective person/company table (I'm only using string IDs for simplicity).

Now let's assume company bar buys company foo. We want to update the table in a way that it becomes:

person | company | num_stocks
-----------------------------
Alice  | bar     | 300
Bob    | bar     | 300

Looking only at Alice's record suggests to use a naive approach like:

UPDATE
  PersonCompanyStocks
SET
  company = "bar"
WHERE
  company = "foo"

However this update fails with duplicate key value violates unique constraint ... because for Bob there already is a row with the key ("Bob", "bar"). For INSERT's Postgres supports ON CONFLICT DO ..., but it looks like there is no equivalent for UPDATE. And clearly we also have to deal with properly merging the num_stock value of the two rows.

What is the best strategy to approach this problem? I only see a relatively ugly solution:

  • One query to determine the duplicates.
  • One UPDATE to merge the duplicates into the final row.
  • One DELETE to remove the offending duplicates.
  • The above UPDATE to do the renaming in rows without duplicates.

This feels complex and is probably prone to race conditions. Does Postgres offer any trick to solve this more elegantly?

Best Answer

We need move stocks between companies, right? What means we need add stocks to existing users and change company for new users. Or, what the same, we could delete all stocks of company "foo" and insert .. on conflict new rows for company "bar"

with rows as (                                             
  delete from PersonCompanyStocks 
    where company = 'foo'
    returning person, num_stocks
)
insert into PersonCompanyStocks (person, company, num_stocks) 
  select person, 'bar', num_stocks from rows
  on conflict(person,company) do update set 
    num_stocks = PersonCompanyStocks.num_stocks + excluded.num_stocks;

Transactional, no race conditions here thanks to row locking during delete.