In a pre-production environment running django 1.6 on PostgreSQL 9.1 we had a problem with database records arriving in a different order from what the application was expecting. So we ran this command as the postgres user:
update pg_database set datcollate='C' where datname='appname';
About a week later it was suddenly impossible to log into the application as a user which did exist in the database and had been unchanged for years. Attempts at looking up the user by username fails but looking up the same user by primary key worked, and the returned username was in fact correct.
That problem was fixed by reindexing the database like this:
REINDEX DATABASE appname;
Is it possible that the update
command has broken the indexes? If that update command is indeed unsafe, then what is the safe way to change collation on a PostgreSQL database?
In case it is relevant, these are the queries that django generates for looking up the user by username or primary key:
SELECT "auth_user"."id", "auth_user"."password", "auth_user"."last_login", "auth_user"."is_superuser", "auth_user"."username", "auth_user"."first_name", "auth_user"."last_name", "auth_user"."email", "auth_user"."is_staff", "auth_user"."is_active", "auth_user"."date_joined" FROM "auth_user" WHERE "auth_user"."username" = username
SELECT "auth_user"."id", "auth_user"."password", "auth_user"."last_login", "auth_user"."is_superuser", "auth_user"."username", "auth_user"."first_name", "auth_user"."last_name", "auth_user"."email", "auth_user"."is_staff", "auth_user"."is_active", "auth_user"."date_joined" FROM "auth_user" WHERE "auth_user"."id" = 1
Best Answer
The
datcollate
column ofpg_database
storesAn other page of the documentation about collations says:
That means that by updating that column you managed to cross some river that should not be crossed under any circumstances.
On the other hand, as collations affect the ordering of
text
,varchar
andvarchar
typed columns (and possibly some others, check the documentation page for details). This also means that the existing indexes on such columns will be ordered differently from what theLC_COLLATE
dictates. This will result of some weird behaviour, and as you've seen, at least aREINDEX
will be needed. I would expect that doing awould help you finding the user by username, because it would do a sequential scan instead of an index scan, so not trying to use the unusable index.
I cannot tell if this is enough to fix your DB, but I would do the following instead:
For the future, you cannot change the DB level collation, but you can define collation on the column level.
One can even change the collation on a column like
where you keep the type as it is, just set a different collation.
Just a final notice: changing the system catalogs directly should always be a last resort. Before doing that (probably except adding new enum values in pre-9.1 versions), one should carefully examine the possible outcomes from the documentation, blog or mailing list posts and testing.