Postgresql – Restore PostgreSQL database with same locale settings as source database

pg-restorepostgresql

I have a PostreSQL 9.x database on an Ubuntu 14.04 LTS production machine.
My development machine is Windows 7-based, providing PostreSQL 9.y.

I want to restore the Ubuntu PostreSQL database on my development machine.

I noticed that the Ubuntu database uses the following locale settings:

  • character set encoding=UTF8
  • collation order, string sort order=en_US.UTF-8
  • character type, character classification=en_US.UTF-8

When I restore the database on the Windows machine without specifying locales, it will be set up with

  • character set encoding=UTF8
  • collation order, string sort order=German_Germany.1252
  • character type, character classification=German_Germany.1252

My plan is to have the database on my development machine as similar as possible to the database on my Ubuntu machine. So my idea was to first create a database on my Windows machine with production-matching locales, then restore my Ubuntu database prod-db.backup backup file into that created database:

createdb --host=localhost --username=postgres --encoding=Unicode --lc-collate=en_US.UTF-8 --lc-ctype=en_US.UTF-8 --owner=prod prod-db
pg_restore --host=localhost --username=postgres --format=custom prod-db.backup --dbname=prod-db

This ideas does not work as the createdb on Windows will complain with the error invalid locale name en_US.UTF-8.

I went on a hunt to find the Windows locale names that match the Ubuntu locale names, including a solution to use the template0 template database, experimenting with different locale identifiers such as en_US.UTF-8 and en_us_utf8 I found scattered in the Internet … but no solution works.

  • Is there a locale identifier for Windows that matches Ubuntu's en_US.UTF-8?
  • Or is locale German_Germany.1252 identical (enough) to en_US.UTF-8 so that I can stick to it and not worry about locales – I want to make sure that database queries behave identical when it comes to aspects such as query result set ordering.

Best Answer

These are the exact steps I take to import a copy of a PostgreSQL 9.5 database (exported on Linux using en_US.UTF-8 encoding) into PostgreSQL 9.3 on Windows 7 or PostgreSQL 9.5 on Windows 8.1. You need to create the database with the appropriate encoding prior to loading the SQL file, otherwise the encoding from Linux (en_US.UTF-8) will prompt Windows to use a default encoding.

REM WARNING: Console code page (65001) differs from Windows code page (1252) 8-bit characters might not work correctly. See psql reference page "Notes for Windows users" for details.
chcp 1252

C:\PostgreSQL\9.5\bin\psql.exe -U postgres -d postgres_password -c "CREATE DATABASE my_db WITH TEMPLATE=template0 ENCODING='UTF-8' LC_COLLATE='american_usa' LC_CTYPE='american_usa'"

REM Grant privileges to the appropriate database user.
C:\PostgreSQL\9.5\bin\psql.exe -U postgres -d postgres_password -c "GRANT ALL PRIVILEGES ON DATABASE my_db TO myuser"

REM Import my_db.sql. Ignore the following error (it comes from a line in the SQL file, but we have already created the database with the correct locale, so we are ok):
REM psql:my_db.sql:22: ERROR:  invalid locale name: "en_US.UTF-8"

C:\PostgreSQL\9.5\bin\psql.exe -o nul --quiet -U postgres -d postgres_password -f my_db.sql

As discussed here, the locale on Windows should be american_usa (on Windows 7). On Windows 8.1, either american_usa or en-US (not en_US as with Unix) will work.