Postgresql – How to copy a PostgreSQL database at a filesystem level between servers

postgresqlpostgresql-9.5

I have a large PostgreSQL database on a server with a failing disk. I cannot run pg_dump because of disk errors.

I have another server with the identical OS (Ubuntu 16.04) and the identical version of PostgreSQL (9.5.6). How do I copy the raw database files to the new server? I don't care if read errors corrupt parts of the database, as I have ways of identifying and fixes those issues, but I need the data copied over before I can fix it. From what I can tell, there aren't many read errors, but even one will stop pg_dump.

There's a scant wiki page on this topic, and it suggests doing a "filesystem level copy" but makes no mention of how to actually do this.

Edit: I used rsync to copy over my data directory (/var/lib/postgresql/9.5/main) and configuration directory (/etc/postgresql/9.5/main). Running sudo service postgresql start on the new server runs without error, but trying to connect via sudo -u postgres psql returns:

psql: could not connect to server: No such file or directory
    Is the server running locally and accepting
    connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?

and the log shows:

2017-05-23 15:13:08 EDT [14604-1] [unknown]@[unknown] LOG:  incomplete startup packet
2017-05-23 15:13:08 EDT [14603-2] LOG:  MultiXact member wraparound protections are now enabled
2017-05-23 15:13:08 EDT [14602-1] LOG:  database system is ready to accept connections
2017-05-23 15:13:08 EDT [14608-1] LOG:  autovacuum launcher started
2017-05-23 15:13:43 EDT [14602-2] LOG:  received fast shutdown request
2017-05-23 15:13:43 EDT [14602-3] LOG:  aborting any active transactions
2017-05-23 15:13:43 EDT [14608-2] LOG:  autovacuum launcher shutting down
2017-05-23 15:13:43 EDT [14605-1] LOG:  shutting down
2017-05-23 15:13:45 EDT [14605-2] LOG:  database system is shut down
2017-05-23 21:33:29 EDT [27345-1] FATAL:  could not load server certificate file "/etc/ssl/certs/ssl-cert-snakeoil.pem": No such file or directory

What is this missing certificate and how do I fix it?

Edit: I've fixed the SSL error by following these instructions, and now all my files seem to have been transferred.

My database seems to be largely accessible. The only issue I've found is when I go to do a full vacuum, I get the error:

ERROR:  could not open file "base/106800/107273": No such file or directory

How do I fix this? I realize it's file corruption, probably caused by that file not being readable during the rsync. How do I "fill it in" with blank values so the vacuum can proceed? The error occurs when vacuuming a specific table, and the data in that table can be regenerated.

Best Answer

This is the script I ended up having to write to transfer all the files, meant to be run from the destination server.

#!/bin/bash
set -i

# Ensure databases on both servers are stopped.
sudo service postgresql stop
ssh -t myuser@oldserver "sudo service postgresql stop"

# Ensure our user owns postgres's files so our rsync call can access them.
ssh -t myuser@oldserver "sudo chown -R myuser:myuser /etc/postgresql"
ssh -t myuser@oldserver "sudo chown -R myuser:myuser /var/lib/postgresql"

# Copy down all the files.
sudo rsync --progress -azv --delete myuser@oldserver:/etc/postgresql/9.5/main/ /etc/postgresql/9.5/main
sudo rsync --progress -azv --delete myuser@oldserver:/var/lib/postgresql/9.5/main/ /var/lib/postgresql/9.5/main

# Fix permissions.
ssh -t myuser@oldserver "sudo chown -R postgres:postgres /etc/postgresql"
ssh -t myuser@oldserver "sudo chown -R postgres:postgres /var/lib/postgresql"
sudo chown -R postgres:postgres /etc/postgresql
sudo chown -R postgres:postgres /var/lib/postgresql

# Fix a missing ssl cert that pg uses for connections.
cd /etc/ssl/certs
sudo make-ssl-cert generate-default-snakeoil --force-overwrite

# Bring the server back up.
sudo service postgresql start

Rsync reported some errors, but I'm browsing my databases via psql and pgadmin3 and so far everything looks correct.

The fix for the SSL error I found here.