I've started working on a project where the application data is split across multiple databases that all have the same schema (basically there's a database per "organisation").
After some discussion it's been decided that this isn't the most efficient way for us to store data and would like to merge all of these separate databases into one single database.
I was wondering what would be the best approach. I'm currently thinking about doing something like the following:
-
Dump the schema from one of the databases and restore it to the new primary database:
pg_dump [OLD_DATABASE] --schema-only > dbschema.sql
psql -d [NEW_DATABASE -f dbschema.sql
-
Dump the data from all of the old databases disabling triggers, and insert format:
pg_dump [OLD_DATABASES] --disable-triggers --data-only --inserts > old_db_data.sql
-
Restore all of the dumped data from database files:
psql [NEW_DATABASE] < old_db_data.sql
Is this the right way to do this kind of merge or am I completely off the mark? Additionally will this handle foreign key dependencies correctly?
Best Answer
Your method is easily improved, however you're mostly there.
When you run this
Instead do
Then load your dumped data, produced by this (from all the servers)
But, then you need to add the add the constraints.. So go back to your original database and run
And, then load post.sql. This will add the constraints.