Two dots (..) or two dashes (--) as a delimiter in the names of files and directories

filenames

Is it a good idea to use double dots or double minus signs as the delimiters? I'm trying to find a good naming convention for experimental scientific data. For example:

2017-12-11T19-45..JDoe-042..UO2(NO3)2-EtOAc_dist..150.3K..1.234mM.dat
2017-12-11T19-45--JDoe-042--UO2(NO3)2-EtOAc_dist--150.3K--1.234mM.dat

My reasons:

  1. To ensure compatibility across platforms, the only suitable characters are _ - . and their combinations;
  2. None of them can be used on their own in my case:
    • _ is reserved for the spaces; due to case-sensitive chemical formulas I cannot use camelCase.
    • - is often a part of internal lab codes, plus it's being used as a replacement for a colon : in time (modified ISO 8601 notation) and ratios;
    • . is a decimal mark.
  3. Among their combinations the most popular, it seems, is _-_. However, this is 3 characters and the filenames are already pretty lengthy (as one can tell from the examples), so I'd like to stick with two characters if possible.
  4. Visually I find it's hard to quickly tell the difference between __ and _, whereas -- vs - and . vs .. are quite distinguishable to me.
  5. I haven't included comma , (as it has been rightfully suggested in the comments, this is also a viable character to consider) as I think it's easy to confuse it with a single dot ., which is already primarily reserved for the numerical values with a decimal point.

According to several posts across the SE network, I would assume both -- and .. are totally acceptable, and I'm thinking of finally choosing the double dot (..). However, I'm not certain, especially regarding how regular expressions or Python scripts handle such files and folders (I have very little experience with both, but I'm learning).

Disregarding the behavior of specialized software, would you say these delimiters are generally safe for common file systems and scripting languages?

Best Answer

One of the more scrutinized and second-guessed design decisions in Unix/Linux is a file system feature that is working in your favor: any character is allowed in a file/directory name except for NUL \0 (ASCII 000) and slash / (the latter being reserved for file paths).

POSIX-compliant and/or well written programs and scripts will handle such lenience but, unfortunately, there are countless examples out there that don't. However, they tend to barf on a very particular set of characters and those characters are not dots or dashes. (Spaces and newlines are two of the most troublesome.) In fact, dots and dashes are very widely used. Common tools, languages and regular expressions will handle them fine...

...with one teeeeny little exception. (Of course, right?) I don't see any indication that you plan on doing this, but it should be noted: avoid dashes at the beginning of a name. This is legal, of course, but too many programs handle such names improperly and interpret them as command-line options/flags. For example, if a script passes the filename to another script like this: some-script --my-dash-first-file ... then don't be surprised to see something like Unknown option '--my-dash-first-file'.
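If a name with a leading dash ever does slip in, two common workarounds are prefixing it with ./ or putting a bare -- before it to mark the end of options. Here is a minimal sketch in Python using argparse; the helper cli_safe is a hypothetical name made up for this illustration, and some-script is just the placeholder from the example above:

```python
import argparse

def cli_safe(name: str) -> str:
    """Prefix './' to a relative name that starts with '-' (hypothetical helper)."""
    return "./" + name if name.startswith("-") else name

# Stand-in for the receiving script: it takes one positional filename.
parser = argparse.ArgumentParser(prog="some-script")
parser.add_argument("filename")

risky = "--my-dash-first-file"

# Workaround 1: the './' prefix makes the argument look like a path, not an option.
args_prefixed = parser.parse_args([cli_safe(risky)])

# Workaround 2: a bare '--' tells argparse that everything after it is positional.
args_separator = parser.parse_args(["--", risky])

print(args_prefixed.filename)
print(args_separator.filename)
```

Not every program honors the -- convention, so the ./ prefix is usually the more portable of the two.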

TL;DR Your proposed schemes are safe if you avoid names that begin with a dash.
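To give a feel for how a script might take such names apart, here is a rough sketch in Python using the two example filenames from the question. The function name split_fields is made up for this illustration, and it assumes no field starts or ends with the delimiter character (a field ending in a dot right before a .. delimiter would make the split ambiguous):

```python
import os
import re

def split_fields(filename, delimiter=".."):
    """Drop the extension, then split the rest on the delimiter (hypothetical helper)."""
    stem, extension = os.path.splitext(filename)
    return stem.split(delimiter), extension

dotted = "2017-12-11T19-45..JDoe-042..UO2(NO3)2-EtOAc_dist..150.3K..1.234mM.dat"
dashed = "2017-12-11T19-45--JDoe-042--UO2(NO3)2-EtOAc_dist--150.3K--1.234mM.dat"

fields_dotted, ext = split_fields(dotted, "..")
fields_dashed, _ = split_fields(dashed, "--")

# In a regular expression an unescaped dot matches any character,
# so the delimiter has to be escaped before it is used as a pattern.
fields_regex = re.split(re.escape(".."), os.path.splitext(dotted)[0])

print(fields_dotted)
print(ext)
```

Plain str.split needs no escaping at all; the only regex-specific point is that .. must be written as \.\. (or passed through re.escape), while -- can be used as is.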

Additional word of caution: Though dots by themselves are common, especially to separate a file's base name from its "extension" (e.g. foo.txt), dots in pairs are usually seen alone, where they have special meaning: the parent directory of the current directory (..) or of the preceding directory in a path (/foo/bar/../baz). So while this won't cause any technical issues, double dots in a name are a bit unconventional and may cause some users to do a double-take.
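The special meaning only applies when .. is an entire path component, which can be checked quickly in Python. This is a small sketch using the standard posixpath and pathlib modules; the /data directory is an assumption made up for the example:

```python
import posixpath
from pathlib import PurePosixPath

# A '..' that stands alone between slashes is resolved as "go up one level".
collapsed = posixpath.normpath("/foo/bar/../baz")

# A '..' embedded inside a name is left untouched.
name = "2017-12-11T19-45..JDoe-042..UO2(NO3)2-EtOAc_dist..150.3K..1.234mM.dat"
unchanged = posixpath.normpath("/data/" + name)

# pathlib likewise treats the whole name as a single component.
parts = PurePosixPath("/data/" + name).parts

print(collapsed)
print(unchanged == "/data/" + name)
print(len(parts))
```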

