Linux – How to find a locale which formats date and time the way I want

date, linux, locale

My question seems like it should be simple but I'm amazed that I can't find the answer.

I live in Canada, so en_CA.UTF8 or en_US.UTF8 are the most appropriate locales; however, I really dislike the date/time formats they have. I agree with xkcd ( https://xkcd.com/1179/ ) that there is one way to format dates: yyyy-mm-dd. I've even gotten used to using the 24-hour clock for time.

So this is a complete disaster for me, as formatted by en_US.UTF8: "Sat 27 Jan 2018 05:20:38 PM EST"

I'm sure there must be a locale which uses variants of

%Y-%m-%d %H:%M:%S %Z

as appropriate for d_fmt, d_t_fmt, date_fmt etc. but I can't find a reference of which locales use which formats.

I've seen the question "Best practice to customize date/time format system-wide?", but it had no responses and mine is slightly different, so I chose to ask a new one. I don't want to customize locales; I want to be able to specify

LC_TIME=??

and get the "correct" format. In my case, the ideal answer is a reference document which details the formats used for various locales so I can choose the LC_TIME which is closest to what I want.

By the way, I've done the exercise of manually modifying en_CA.UTF8 with settings that I'm happy with, but this is too time-consuming to do in multiple environments. There must be an easier way.

Best Answer

Locale data on most Linux systems is defined in the glibc source code; I’ll explain how to use that to look for a matching locale.

In a clone of the repository, go into localedata/locales; each locale is a separate file there, and the date format is determined by the d_fmt entry (d_t_fmt for date and time). These use %d, %m, and %y (or %Y) as you might expect, but they are often encoded as Unicode code points: % is <U0025>, d is <U0064>, m is <U006D>, and y is <U0079>, with their uppercase counterparts 32 code points lower (for example, Y is <U0059>).

Running

grep -iE 'd_t_fmt.*(%|<U0025>)(y|<U00[57]9>).*(%|<U0025>)(m|<U00[46]D>).*(%|<U0025>)(d|<U00[46]4>)' *

lists a number of candidates, of which en_DK is likely to be the most appropriate:

$ LC_TIME=en_DK date +%x
2018-02-11
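If cloning glibc is not convenient, a similar check can be run against the locales already installed on a machine. The following is only a sketch: it assumes the standard `locale` utility is available, and the helper names `is_iso_date_fmt` and `list_iso_locales` are made up for illustration. It asks each installed locale for its d_fmt value and prints those that begin with year-month-day order.

```shell
#!/bin/sh
# Succeeds when a strftime-style format starts in ISO order,
# i.e. %Y-%m-%d or its shorthand %F. (Hypothetical helper name.)
is_iso_date_fmt() {
    case "$1" in
        %Y-%m-%d*|%F*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print "locale<TAB>d_fmt" for every installed locale whose
# d_fmt is ISO-style. (Hypothetical helper name.)
list_iso_locales() {
    locale -a 2>/dev/null | while IFS= read -r loc; do
        # LC_ALL makes the locale utility report that locale's values.
        fmt=$(LC_ALL="$loc" locale d_fmt 2>/dev/null) || continue
        if is_iso_date_fmt "$fmt"; then
            printf '%s\t%s\n' "$loc" "$fmt"
        fi
    done
}

list_iso_locales
```

Only locales that have been generated on the system show up in `locale -a`, so a candidate such as en_DK may need to be generated first before it appears in the output.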