Postgresql – Is LC_COLLATE different from COLLATE

collationpostgresql

Postgres 12 on Windows 7

This works:

CREATE DATABASE "foo"
    ENCODING 'UTF8'
    LC_COLLATE = 'en-US'
    LC_CTYPE = 'en-US'
    TEMPLATE template0

Afterwards, running SHOW LC_COLLATE with database foo selected reports:

screenshot

But when running

SELECT UPPER('kedi' COLLATE "en-US")

it reports:

ERROR: collation "en-US" for encoding "UTF8" does not exist

Shouldn't both accept the same collation names?

Best Answer

LC_COLLATE refers to a name of locale from the operating system, whereas COLLATE refers to a collation that should exist in pg_catalog.pg_collation.

pg_catalog.pg_collation is originally populated in the template databases when the PostgreSQL instance is created (by initdb).

Specifically it's the SQL function pg_import_system_collations() that should do that. From the doc (emphasis mine)

pg_import_system_collations ( schema regnamespace ) → integer

Adds collations to the system catalog pg_collation based on all the locales it finds in the operating system. This is what initdb uses; see Section 23.2.2 for more details. If additional locales are installed into the operating system later on, this function can be run again to add collations for the new locales. Locales that match existing entries in pg_collation will be skipped. (But collation objects based on locales that are no longer present in the operating system are not removed by this function.) The schema parameter would typically be pg_catalog, but that is not a requirement; the collations could be installed into some other schema as well. The function returns the number of new collation objects it created.

For some reason you don't have en-US in pg_collation while it's valid from your operating system, so you seem to be in the case where the advice is to run pg_import_system_collations() manually in the databases that need it.