PostgreSQL Upgrade – Converting Latin1 to UTF8

encodingpostgresql

I'm pretty new to PostgreSQL, but I've been tasked with doing an upgrade. The current system runs on CentOS 5.11 and is Postgres 8.1.23 with a default encoding of latin1. The new server is OpenSuSE 42.1 with Postgres 9.4.9 with a default encoding of utf8.

I've been trying to use the pg_dumpall command in order to get the users and their roles moved along with the databases/tables/data. But, I'm unable to restore to the new server because of the difference in encoding.

I see that the pg_dump command has the -E option to specify encoding, but it doesn't seem to give me a way to get users and roles. Would the -E option even accomplish what I'm needing — the conversion from latin1 to utf8?

What is the best way for me to migrate/upgrade users, roles, data, etc. and at the same time convert from latin1 to utf8?

Thanks!

Best Answer

Here are the steps you should perform:

  1. Dump roles and tablespaces with pg_dumpall and the --globals-only option
  2. Dump data and schemas with pg_dump and the --encoding option to have data in utf8 for each database
  3. Create each new database on new server with correct encoding
  4. Restore roles and tablespaces on new server
  5. Restore schemas and data for each new database

Here you will find documentation for pg_dump and pg_dumpall in 8.1: https://www.postgresql.org/docs/8.1/static/app-pgdump.html https://www.postgresql.org/docs/8.1/static/app-pg-dumpall.html

And here you will find documentation for pg_restore and psql (your restore tool depends on your dump options) in 9.4: https://www.postgresql.org/docs/9.4/static/app-pgrestore.html https://www.postgresql.org/docs/9.4/static/app-psql.html

And here is generic documentation about databases encoding in PostgreSQL: https://www.postgresql.org/docs/9.4/static/multibyte.html

Have a nice day!

Arkhena