Postgresql – Disable all constraints and table checks while restoring a dump

check-constraintsdatabase-designpg-dumppostgresqlpostgresql-9.1

I've obtained a dump of my PostgreSQL database with:

pg_dump -U user-name -d db-name -f dumpfile

which I then proceed to restore in another database with:

psql X -U postgres  -d db-name-b -f dumpfile

My problem is that the database contains referential constraints, checks and triggers and some of these (checks it would seem in particular) fail during restoration since the information is not loaded in the order that would cause those checks to be honored. For instance, inserting a row in a table may be associated with a CHECK that calls a plpgsql function that checks whether a condition holds in some other unrelated table. If that latter table is not loaded by psql before the former, an error occurs.

The following is a SSCCE that produces such a database that once dumped with pg_dump cannot be restored:

CREATE OR REPLACE FUNCTION fail_if_b_empty () RETURNS BOOLEAN AS $$
    SELECT EXISTS (SELECT 1 FROM b)
$$ LANGUAGE SQL;

CREATE TABLE IF NOT EXISTS a (
     i              INTEGER                    NOT NULL
);

INSERT INTO a(i) VALUES (0),(1);
CREATE TABLE IF NOT EXISTS b (
    i  INTEGER NOT NULL
);
INSERT INTO b(i) VALUES (0);

ALTER TABLE a ADD CONSTRAINT a_constr_1 CHECK (fail_if_b_empty());

Is there a way to disable (from the command line) all such constraints during dump restoration and enable them back again afterwards? I am running PostgreSQL 9.1.

Best Answer

So you look up other tables in a CHECK constraint.

CHECK constraints are supposed to run IMMUTABLE checks. What passes OK for a row at one time should pass OK at any time. That's how CHECK constraints are defined in the SQL standard. That's also the reason for this restriction in the manual:

Currently, CHECK expressions cannot contain subqueries nor refer to variables other than columns of the current row.

Still, expressions in CHECK constraints are allowed to use functions, even user-defined functions. Those should be IMMUTABLE, but Postgres does not currently enforce this. According to this related discussion on pgsql-hackers, one reason is to allow references to the current time, which is not IMMUTABLE by nature.

But you are looking up rows of another table, which is completely in violation of how CHECK constraints are supposed to work. I am not surprised that pg_dump fails to provide for this.

Move your check in another table to a trigger (which is the right tool), and it should work with modern versions of Postgres.

PostgreSQL 9.2 or later

While the above is true for any version of Postgres, several tools have been introduced with Postgres 9.2 to help with your situation:

pg_dump option --exclude-table-data

A simple solution would be to dump the db without data for the violating table with:

--exclude-table-data=my_schema.my_tbl

Then append just the data for this table at the end of the dump with:

--data-only --table=my_schema.my_tbl

But complications with other constraints on the same table might ensue. There is an even better solution:

Solution: NOT VALID

Up to Postgres 9.1, the NOT VALID modifier was only available for FK constraints. This was extended to CHECK constraints in Postgres 9.2. The manual:

If the constraint is marked NOT VALID, the potentially-lengthy initial check to verify that all rows in the table satisfy the constraint is skipped. The constraint will still be enforced against subsequent inserts or updates [...]

A plain Postgres dump file consists of three "sections":

  • pre_data
  • data
  • post-data

Postgres 9.2 also introduced an option to dump sections separately with -- section=sectionname, but that's not helping with the problem at hand.

Here is where it gets interesting. The manual:

Post-data items include definitions of indexes, triggers, rules, and constraints other than validated check constraints. Pre-data items include all other data definition items.

Bold emphasis mine.
You can change the offending CHECK constraint to NOT VALID, which moves the constraint to the post-data section. Drop and recreate:

ALTER TABLE a
  DROP CONSTRAINT a_constr_1
, ADD  CONSTRAINT a_constr_1 CHECK (fail_if_b_empty()) NOT VALID;

A single statement is fastest and rules out race conditions with concurrent transactions. (Two commands in a single transaction would work, too.)

This should solve your problem. You can even leave the constraint in that state, since that better reflects what it actually does: check new rows, but give no guarantees for existing data. There is nothing wrong with a NOT VALID check constraint. If you prefer, you can validate it later:

ALTER TABLE a VALIDATE CONSTRAINT a_constr_1;

But then you are back to the status quo ante.