SELECT or Set-Returning Version of MERGE/UPSERT in PostgreSQL 10

postgresql-10selectupsert

I'm learning MERGE or UPSERT (UPDATE + INSERT), which according to wikipeda, has the form:

 MERGE INTO tablename USING table_reference ON (condition)
   WHEN MATCHED THEN
   UPDATE SET column1 = value1 [, column2 = value2 ...]
   WHEN NOT MATCHED THEN
   INSERT (column1 [, column2 ...]) VALUES (value1 [, value2 ...]);

, and:

If the ON field(s) in the Source matches the ON field(s) in the
Target, then UPDATE

If the ON field(s) in the Source does not match the ON field(s) in the
Target, then INSERT

I was wondering if there is a SELECT version of it (without side effects) that returns the would-be result of the updated target table without actually changing the target table?

Using PostgreSQL 9.5's INSERT INTO ... ON CONFLICT [ conflict_target ] conflict_action statement (pointed out in wikipedia) as an example. Given a target table distributors and an update table distributors_upd as follows:

DROP TABLE IF EXISTS distributors;
DROP TABLE IF EXISTS distributors_upd;

CREATE TABLE distributors (did INTEGER PRIMARY KEY, dname VARCHAR);
INSERT INTO distributors
VALUES (5, 'Gizmo Transglobal'), (6, 'Associated Computing, Inc');

CREATE TABLE distributors_upd (did INTEGER PRIMARY KEY, dname VARCHAR);
INSERT INTO distributors_upd
VALUES (5, 'Gizmo Transglobal*');

Is there an effective query to get the result of a UPSERT without actually changing the target table distributors?

I tried to mimic the effect with:

WITH distributors_updated AS (
INSERT INTO distributors (did, dname)
    (TABLE distributors_upd)
    ON CONFLICT (did) DO UPDATE SET dname = EXCLUDED.dname
    RETURNING *
)
--SELECT * FROM distributors;
SELECT * FROM distributors_updated;

Neither one of the last two lines gives the would-be result:

 did |           dname           
-----+---------------------------
   6 | Associated Computing, Inc
   5 | Gizmo Transglobal*

The first ... SELECT * FROM distributors shows the original table (by design).

-----+---------------------------
   5 | Gizmo Transglobal
   6 | Associated Computing, Inc
(2 rows)

The second ... SELECT * FROM distributors_updated shows the change rows only.

-----+--------------------
   5 | Gizmo Transglobal*
(1 row)

The content of distributors is changed either way. Is there an effective query returning the "merged"/"unioned" without actually changing the target distributors?

Best Answer

If you want to find the results of the "UPSERT" without actually performing it, you can run a query with FULL JOIN:

SELECT 
    did,
    COALESCE(s.dname, t.dname) AS dname   -- in case of conflict, 
                                          -- give precedence to source table
FROM 
    distributors AS t
    FULL JOIN distributors_upd AS s
    USING (did) ;

Test at dbfiddle.uk (thnx to McNets for setting it up)

The columns that would be updated on the conflict by the merge will have the source column first in the COALESCE() function and the target column second. If the update is more complex, you'll have to use a more complex COALESCE() or CASE expression. Example:

ON CONFLICT UPDATE 
    SET counter = counter + EXCLUDED.counter

will become

SELECT
    COALESCE(t.counter + s.counter, t.counter, s.counter) AS counter