PostgreSQL Default Collation Not Working

collationpostgresql

I don't understand why the default collation on my psqlserver 11 instance returns a different result than using with explicit collation.

Details:

show LC_CTYPE;        -- de-AT
show LC_COLLATE;      -- de-AT

When I use the default collation I get an unexpected output:

demo=# SELECT unnest(array['a','B', 'A']) order by 1;
 unnest 
--------
 A
 B
 a
(3 rows)

When I add the default collation explicitly, I get the expected output:

demo=# SELECT unnest(array['a','B', 'A']) collate "de-AT-x-icu" ORDER BY 1;
 unnest 
--------
 a
 A
 B
(3 rows)

The collation de-AT-x-icu uses exactly the default values for LC_COLLATE and LC_COLTYPE shown above:

SELECT collname, collcollate, collctype
FROM pg_collation
where collname like '%de-AT%';

  collname   | collcollate | collctype 
-------------+-------------+-----------
 de-AT-x-icu | de-AT       | de-AT

So I think both queries should return the same result: What am I missing?

More info:

  • postgresql version 11.5
  • OS inside the Docker container:
    Linux 8660fb4cef84 4.9.184-linuxkit #1 SMP Tue Jul 2 22:58:16 UTC 2019 x86_64 Linux

Best Answer

Let me guess: you are using Alpine Linux, which uses musl as its C library, which uses ICU collations by default.

It is known that the collations in Alpine Linux don't work quite as they should, which is the cause of the problem. I am not sure if that is PostgreSQL's fault or not.

At any rate, you should use a different Linux distribution that uses glibc.