Argument Options – Why Did ‘Argument Can Be Squished Against Option’ Prevail?


Inspired by the recent question Why does the specific sequence of options matter for tar command?, in which the asker learned why tar -cfv test.tar *.jpg doesn't work, I'd like to ask a followup: seriously, why not?

When a command has an option -f that requires an argument and an option -v that doesn't, this:

cmd -fv foo

can be interpreted in 2 different ways: the v is the argument for the -f option and foo is a non-option argument, or foo is the argument for the -f option and the -v option is present. The first interpretation is what POSIX getopt() does, so there are lots of commands that behave that way.
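The first interpretation is easy to observe with Python's getopt module, which follows the conventions of the Unix getopt() function. This is only a small sketch: the option string "f:v" (meaning -f takes an argument and -v does not) is an assumption chosen to mirror the hypothetical cmd above.

```python
import getopt

# "f:" declares that -f takes an argument; "v" declares a plain flag.
opts, operands = getopt.getopt(["-fv", "foo"], "f:v")

print(opts)      # [('-f', 'v')]  the "v" was consumed as the argument to -f
print(operands)  # ['foo']        "foo" is left over as a non-option argument
```

Note that -v is never seen as an option here; it only exists as the text of the argument to -f.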

I always preferred the second interpretation. Packing all the options together (regardless of whether they take arguments) seems more useful than squishing the foo up against the -f to turn -f foo into -ffoo. But this behavior barely exists anymore. The only command I've used lately that does it is Java's jar (which has a syntax clearly inspired by that Sun version of tar which accepts tar cfv tarfile ...).

Xlib has a getopt-like function, XrmParseCommand, which allows options to be specified as either taking "separate" args or "sticky" args. But it deals with long options (-display, -geometry, etc.) so it sees -fv as just another option with no relation to either -f or -v. So it's not an example of my second interpretation.

When and why did squished args become dominant? Was it already settled before POSIX, or did the POSIX mandate decide the issue? Did the first version of POSIX even have the same specific requirement as the current version? Is there any archived discussion of the subject from ancient times?

Are there any other commands (besides tar and jar) that support or have historically supported the -fv foo = -f foo -v style of option parsing?

Best Answer

First of all, in standard getopt()-style argument processing, arguments don't have to be squished against the option they apply to, they just can be. So if -f takes an argument, both of the following are valid:

command -ffoo positional arguments
command -f foo positional arguments
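As a quick check that the two spellings are equivalent, here is a sketch using Python's getopt module, which follows the same convention. The option string "f:" is an assumption standing in for whatever the real command declares.

```python
import getopt

SHORTOPTS = "f:"  # assumed declaration: -f takes an argument

attached = getopt.getopt(["-ffoo", "positional", "arguments"], SHORTOPTS)
separate = getopt.getopt(["-f", "foo", "positional", "arguments"], SHORTOPTS)

print(attached)              # ([('-f', 'foo')], ['positional', 'arguments'])
print(attached == separate)  # True
```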

What you call the "second interpretation" is in fact very, very rare. As far as I can think of right now, tar and mt are the only extant commands that work that way... and, as you mention, jar, but that's only because it emulates tar. These commands process arguments very differently from the standard getopt() style. The options are not even preceded by -! I can't say for sure why it was rarely used, but I would guess that it's because it's harder to tell which options go with which arguments. For example, if b and d take arguments but a, c, and e don't, then you have this:

command abcde b-argument d-argument

...which means that while you are composing the command you have to look back at the option letter group, read it again, remember which options you specified require arguments, and write out the arguments in the same order. What about this?

command adcbe b-argument d-argument

Oops, the d option got the b-argument and vice versa. Worse, if you see:

command lmnop foo bar baz

...and you are not familiar with the command, you have no idea which arguments go with which options. foo, bar, and baz might be arguments to n, o, p (and l and m take no arguments) or foo and bar might go with, say m and p while baz is a positional parameter... or many other possible combinations.
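To make the pairing problem concrete, here is a rough sketch of the bundled scheme described above. It is not tar's actual code: parse_bundle and the takes_arg set are hypothetical names, and the only rule implemented is "each argument-taking letter consumes the next word, in the order the letters appear".

```python
def parse_bundle(argv, takes_arg):
    """Parse a tar-style letter bundle (hypothetical helper).

    argv[0] is the bundle of option letters; every letter listed in
    takes_arg consumes the next remaining word, in bundle order.
    """
    bundle, rest = argv[0], list(argv[1:])
    opts = []
    for letter in bundle:
        if letter in takes_arg:
            if not rest:
                raise ValueError(f"option {letter!r} requires an argument")
            opts.append((letter, rest.pop(0)))
        else:
            opts.append((letter, None))
    return opts, rest  # whatever is left over is positional


takes_arg = {"b", "d"}  # assumed: only b and d take arguments

good, _ = parse_bundle(["abcde", "b-argument", "d-argument"], takes_arg)
swapped, _ = parse_bundle(["adcbe", "b-argument", "d-argument"], takes_arg)

print(dict(good)["b"], dict(good)["d"])        # b-argument d-argument
print(dict(swapped)["b"], dict(swapped)["d"])  # d-argument b-argument
```

Nothing in the command line itself signals the mix-up in the second call; the parser has no way to know the words were meant for different letters.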

Related Solutions

Debian – Why did Debian create the DFSG

The early definition of free software (set forth in the GNU’s Bulletin Volume 1, Number 1 in 1986) was unknown to the authors of the Debian Free Software Guidelines in 1997. This early definition was much weaker than the DFSG and it seems that The Free Software Definition had not yet been published as such.

Here is an excerpt from a comment by Bruce Perens (the primary author of DFSG) (found as a reference in Wikipedia’s Debian Free Software Guidelines article):

Richard wrote a statement of the Four Freedoms in an early edition of the GNUs Bulletin, which was mostly distributed in paper form on the MIT campus and environs. He did not further promote them until a long time later. So, when I had to write license guidelines for Debian, the Four Freedoms document was unknown. …

Much later, FSF published its statement of the Four Freedoms on its web site as an alternative to the Open Source Definition.

In fact, the 1986 GNU’s Bulletin definition was not the modern “Four Freedoms”, but a simplified version that focuses on the abilities to redistribute and change programs (but not specifically the ability to redistribute changed programs!). This early definition is close to the “modern” freedoms two and one.

The DFSG were first published in the July 1997 announcement of the Debian “Social Contract”. They explicitly mention the ability to redistribute modified source code (or at least “original plus patches”). This was not explicit in the early GNU’s Bulletin definition, though it is related to “modern” freedom three.

archive.org’s snapshots of http://www.gnu.org/philosophy/free-sw.html:

January 1998 - first archived version; (unnumbered) freedoms one through three
April 1999 - added freedom zero
May 2001 - first version called “The Free Software Definition”

Command Line – What is a Non-Option Argument?

The terminology is not completely fixed, so different documentation uses different terms, or worse, the same terms with different meanings. The terminology in the man page you're reading is a common one. It is the one used in the POSIX standard. In a nutshell, each word after the command is an argument, and the arguments that start with - are options.

Argument

In the shell command language, a parameter passed to a utility as the equivalent of a single string in the argv array created by one of the exec functions. An argument is one of the options, option-arguments, or operands following the command name.

Operand

An argument to a command that is generally used as an object supplying information to a utility necessary to complete its processing. Operands generally follow the options in a command line.

Option

An argument to a command that is generally used to specify changes in the utility's default behavior.

“Utility” is what is generally called “command” (the standard uses the word utility to avoid ambiguity with the meaning of “command” that includes the arguments or even compound shell commands).

Most commands follow the standard utility argument syntax, where options start with a - (dash a.k.a. minus). So an option is something like -a (short option, follows the POSIX guidelines) or --all (long option, an extension from GNU). A non-option argument is an argument that doesn't begin with -, or that consists solely of - (which the who command treats as a literal file name but many commands treat as meaning either standard input or standard output).

In addition, some options themselves have an argument. This argument can be passed in several ways:

For a single-letter option, in the same argument to the utility: foo -obar: bar is the argument to the single-letter option -o.
In the GNU long argument syntax, in the same argument, separated by an equal sign: foo --option=bar.
In a separate argument: foo -o bar or foo --option bar. If the option -o (or --option) takes an argument, then bar is the argument of the option -o (or --option). If -o (or --option) does not take an argument then bar is an operand.

Here's a longer example:

tail -n 3 myfile

-n is an option, 3 is an argument to the option -n, and myfile is an operand.
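The same split into options, option-arguments, and operands can be reproduced with Python's getopt module. This is an illustrative sketch only: the option declarations ("n:", "o:", "option=") are assumptions standing in for what tail or the hypothetical foo would declare.

```python
import getopt

# tail -n 3 myfile: -n is declared as taking an argument.
opts, operands = getopt.getopt(["-n", "3", "myfile"], "n:")
print(opts)      # [('-n', '3')]   option and its option-argument
print(operands)  # ['myfile']      operand (non-option argument)

# GNU-style long option with the argument after an equal sign.
long_opts, long_operands = getopt.getopt(
    ["--option=bar", "file"], "o:", ["option="]
)
print(long_opts)      # [('--option', 'bar')]
print(long_operands)  # ['file']

# A lone "-" is not parsed as an option; it is passed through as an operand.
dash_opts, dash_operands = getopt.getopt(["-"], "n:")
print(dash_opts, dash_operands)  # [] ['-']
```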

Terminology differs, so you may find documents that use argument in the sense where POSIX uses operand. But “non-option argument” is more common than either term for this meaning.
