Postgresql – In Postgres, How to merge two tables

postgresql

I have a set of web services that all need updating with a new set of static data.

I can't use a distributed transaction, but would like to be able to ask each service to rollback it's changes if I detect something went wrong.

I aim to ask each service to backup the data I care about, and apply the update… then later ask the service to restore the backup if I need to recover.

To complicate things I may only be backing up some of the data in the table (let's say.. where the ID starts with "My." so "My.123" will be backedup and restored but "YOURS.345" won't be changed by this process at all.

I'd like to make backup and restore functions that take a table name.

I was thinking of backing up tables with a generic version of the following code:

--=== BACKUP ===--
create table utilities.labpreferences_snapshot ( LIKE utilities.labpreferences EXCLUDING DEFAULTS ) WITHOUT OIDS;

-- copy data
insert into utilities.labpreferences_snapshot
select * from utilities.labpreferences where s.id like 'My.%';

When restoring, I am expecting that some rows may have been modified or deleted or new ones created. I want to make sure I don't break andy foreign keys referencing data in this table, but I don't expect any new foreign keys since the backup state was captured.

I could restore the data in the same way I copy it… but INSERT only works for data that was deleted… so.. I could delete all the data I want to replace..

--option1 
--=== RESTORE ===--
-- possibly stop foreign key constraint checking?

-- remove master data
DELETE FROM utilities.labpreferences s where s.id like 'My.%';

-- replace snapshot data
insert into utilities.labpreferences
select * from utilities.labpreferences_snapshot;

-- restart foreign key constraint checking somehow?

but I worry about the foreign keys.

I could delete the added ones, add the deleted ones, and update the modified ones.

--option2 
--=== RESTORE ===--

DELETE FROM utilities.labpreferences s where s.id like 'My.%' and s.id not in (select id from utilities.labpreferences_snapshot);

insert into utilities.labpreferences
select * from utilities.labpreferences_snapshot where s.id not in (select id from utilities.labpreferences);

Update ???  -- maybe something like http://stackoverflow.com/questions/3891758/how-to-update-one-table-from-another-one-without-specifying-column-names

The update looks hard.

I'm feeling a little out of my depth. Can anyone suggest a solution?

Best Answer

One approach could be to use a writable view to transform results in-query, eg

ALTER TABLE utilities.labpreferences
    RENAME TO utilities.old_labpreferences

CREATE TABLE utilities.labpreferences
( utilities.old_labpreferences EXCLUDING DEFAULTS )
WITHOUT OIDS;

CREATE OR REPLACE RULE "labpreferences_select" AS
  ON SELECT TO utilities.labpreferences
  DO INSTEAD
  SELECT -- fields with modifications --
    FROM utilities.old_labpreferences
    WHERE s.id like 'My.%'
  UNION
  SELECT -- fields without modification --
    FROM utilities.old_labpreferences
    WHERE s.id not like 'My.%'
    ;

Backing the changes out is as simple as removing the rule and new (empty) table, and renaming the old table back.

You will also need rules for update/insert/delete to this table - the postgres documentation for DO INSTEAD is pretty good on this front - see http://www.postgresql.org/docs/devel/static/rules-views.html