Postgresql – Dump PostgreSQL schema with foreign keys and referenced data

dumpforeign keypostgresqlschema

The official PostgreSQL documentation states that pg_dump can use the --schema=SCHEMA option to selectively dump a given schema.

However according to the same documentation (read Note on --schema=SCHEMA) foreign-keys to objects outside the selected schema will be included in the dump, but not the tables which they refer to.

With that said, I would like to know how to dump a given schema in a way that it includes all the needed tables regardless of the schema they are in and that this is done in a way that is safe/easy to restore the dump.

Additionally, are there any tools available that do it already?

Thanks

Best Answer

I am not aware of any tools to do exactly what you are asking right now. I want to suggest a few options you might have overlooked however that may meet your needs.

Database Cloning

If your database is not being used 24/7 you can clone it for test cases. This would be done using during times when the db is not in use:

createdb -U postgres new_dbname -T old_dbname

This does a file-level copy of the database, and so it preserves all data and is much faster than a dump and reload.

I very frequently use this technique for cases where I want to run test cases on one db where I can't guarantee that everything will roll back.

Base Backup and Restore

A second thing you can do if you want to bring all data onto another server is to set it up for replication as a master with at least one wal_sender, and use pg_base_backup to clone your db cluster over the network. Note this is a file-based clone, and so it will be a very network-intensive process (more than a dump) but restore will be a lot faster.

Writing your own Tool

One of the big difficulties in having a generally available tool is that cross-schema dependencies could be quite complex and handling all possible complexity is a bit of a daunting task. However for specific purposes this isn't too hard. I would recommend three steps, using pg_dump to get you most of the way there, and then having a script written in one language or another to add what we can hope are modest missing pieces. I think you could even do it with an sql script run through psql.

  1. Dump schemas only of tables you need. Create new file. Something like: pg_dump -t mytable1 -t mytable2 -t mytable3 > my_partial_dump.sql

  2. Export selected data to csv, and wrap using COPY statements. This could be done using psql along with echo and COPY TO STDOUT. This is appended to the partial dump.

  3. Dump the schema as you are, and append this to the partial dump.

The advantage of a purpose-built tool if this is what you need is that you can manually manage the complex part of it and thus have a smaller, simpler tool than you might otherwise require.