Postgresql: How to avoid encoding issues when copying a schema from one server to another

pg-dumppg-restorepostgresql

I'm using pg_dump and pg_restore to move a schema from one Postgresql 9.5 server to another. On the destination server:

$ pg_dump -h source.example.com -n my_schema -v --no-owner -F c -f my_schema.dump
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LC_CTYPE = "UTF-8",
    LANG = (unset)
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
...
pg_dump: saving encoding = UTF8
(dump completes with no other errors or warnings)

$ pg_restore -h 127.0.0.1 -e -v --no-owner -d my_db my_schema.dump
...
perl: warning: Falling back to the standard locale ("C").
...
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 2211; 0 6549333 TABLE DATA mention chicken
pg_restore: [archiver (db)] COPY failed for table "mention": ERROR:  invalid byte sequence for encoding "UTF8": 0xcd 0x2e

Any idea on how to solve this issue? What I want is an exact binary copy of the data. There seems to be some encoding problem which makes me nervous, since what is restored may not be exactly the same as the dump, even if I don't get any errors.

Best Answer

This looks like you are migrating from one OS to another (like Linux / Windows or something like it), because different OSes have different names for locales.

One way I would do it is to find out the exact locale name on the target system which corresponds to the one on the source, then use a text dump (the "normal" output of pg_dump) and on the target system, use psql to log in and set the "client encoding" to whatever the locale name from the first step is, then load the dump with \i.