PostgreSQL ON CONFLICT – ON CONFLICT … WHERE Condition Doesn’t Seem to Work

postgresqlquery

We're trying to populate a new table for a new feature with information from an old table, but we are going to be running this query as a script multiple times because we want to catch any user updated data.

Table 1 Schema.

table1
id | email | first_name | last_name | job | phone | user_id | institution_id | institution_type | completion_state | created_at | updated_at | mobile_phone | deleted_at
----+-------+------------+-----------+-----+-------+---------+----------------+------------------+------------------+------------+------------+--------------+------------

Table 2 Schema.

table2
 id | contact_id | status | data | created_at | updated_at | source_updated_at
----+------------+--------+------+------------+------------+-------------------

This is my query


INSERT INTO table2 AS iic (contact_id, data, created_at, updated_at, source_updated_at)
SELECT
  cc.id,
  (
  SELECT row_to_json(_) FROM (
    SELECT
      cc.email,
      cc.first_name,
      cc.last_name,
      cc.job,
      cc.phone,
      cc.user_id,
      cc.institution_id,
      cc.institution_type,
      cc.completion_state,
      cc.updated_at,
      cc.mobile_phone,
      inst.type,
      inst.name,
      inst.description,
      inst.assets_under_management,
      inst.phone,
      add.id,
      add.street,
      add.street2,
      add.city,
      add.state,
      add.country,
      add.zip_code
      ) AS _
    ) AS data,
  current_timestamp,
  current_timestamp,
  cc.updated_at
FROM table1 cc
LEFT JOIN institutions inst
  ON cc.institution_id = inst.id
LEFT JOIN addresses add 
  ON add.addressable_id = inst.id
ON CONFLICT (contact_id) DO
  UPDATE SET status = 'updated',
             source_updated_at = excluded.updated_at
  WHERE iic.source_updated_at != excluded.updated_at;

So, we're copying Table 1 information into Table 2. The first time we run it, everything copies over perfectly, and the status column is null. If I then change the updated date for a single row on table1, I expect that the

ON CONFLICT ... UPDATE

would trigger and update table2 ONLY where the updated_at row has changed. Instead, it changes all status on table2 to updated.

I suspect that the WHERE clause is not row specific and applies to a larger set, in which case this condition will always return true. However, if someone can shed a light on how I can modify my query to return what I am looking for that would be great.

Please let me know if any of this requires clarification.

Cheers,

postgres v.9.6.11

Best Answer

With the help of @a_horse_with_no_name, I was able to find the error in my thinking. There is a great example in the PostgresSQL docs that really clarified it for me:

Insert or update new distributors as appropriate. Assumes a unique index has been defined that constrains values appearing in the did column. Note that the special excluded table is used to reference values originally proposed for insertion:

INSERT INTO distributors (did, dname)
    VALUES (5, 'Gizmo Transglobal'), (6, 'Associated Computing, Inc')
    ON CONFLICT (did) DO UPDATE SET dname = EXCLUDED.dname;

My original thought was that the EXCLUDED table used the same column names as the table I was trying to insert from. In actuality, it references the column names of the table in which I am trying to insert into, but the values are what were originally proposed for insertion.

So, the end of my query should have looked like this:

...
ON CONFLICT (contact_id) DO
  UPDATE SET status = 'updated',
             source_updated_at = EXCLUDED.source_updated_at
  WHERE iic.source_updated_at != EXCLUDED.source_updated_at;

Thank you for being my rubber duck, and I hope this helps someone else in the future!