Shell – Why does `env var=value` allow an arbitrary name in var?

environment-variables, shell

Reading env POSIX documentation:

Some have suggested that env is redundant since the same effect is
achieved by:

name=value … utility [ argument … ]

The example is equivalent to
env when an environment variable is being added to the environment of
the command, but not when the environment is being set to the given
value. The env utility also writes out the current environment if
invoked without arguments. There is sufficient functionality beyond
what the example provides to justify inclusion of env.

AFAICT, the statement above means that var=value command is the same as env var="value" command, but not the same as env -i var="value" command.
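To make that reading concrete, here is a minimal sketch. It assumes a POSIX sh and an env that supports -i; FOO is just a placeholder name, and env_bin is a helper variable introduced for this example so that the child env can still be found once the environment (including PATH) has been cleared.

```shell
# Resolve env to an absolute path up front (assumption: it is on PATH).
env_bin=$(command -v env)

# Adding a variable: the assignment prefix and env should behave the same.
a=$(FOO=bar sh -c 'printf %s "$FOO"')
b=$("$env_bin" FOO=bar sh -c 'printf %s "$FOO"')
echo "prefix: $a, env: $b"

# Setting the whole environment: only env -i can start from an empty one,
# so the child env should list FOO and nothing else.
only=$("$env_bin" -i FOO=bar "$env_bin")
echo "$only"
```

The first two commands differ only in who builds the child's environment (the shell or env); the third has no assignment-prefix equivalent.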

Now, at least with the env implementations on GNU systems, FreeBSD, and Solaris 11, I find that the two forms are not equivalent, because env allows any character except = and \0 in the variable name:

$ env 'BASH_FUNC_foo%%=() { echo foo; }' bash -c foo

prints foo, whereas you can't use BASH_FUNC_foo%%='() { echo foo; }' in any shell, because BASH_FUNC_foo%% is clearly not a valid variable name.

In POSIX shells other than bash, this leaves a variable named BASH_FUNC_foo%% in the environment, which the shell cannot access.
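The difference can be reproduced without bash function exports. The sketch below uses my.var as a made-up name containing a character that is invalid in shell variable names. It assumes an env that accepts such names, as the GNU, FreeBSD, and Solaris 11 implementations mentioned above do. No shell sits between the two env processes, so nothing can filter the name out.

```shell
env_bin=$(command -v env)

# env passes the odd name straight through to the child process.
seen=$("$env_bin" 'my.var=1' "$env_bin" | grep '^my\.var=')
echo "child saw: $seen"

# The shell does not parse my.var=1 as an assignment; it treats the
# word as a command name, so the lookup is expected to fail.
status=0
sh -c 'my.var=1 true' 2>/dev/null || status=$?
echo "shell exit status: $status"
```

The non-zero status comes from the failed command lookup, not from a rejected assignment.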

So, what is the purpose of allowing an arbitrary name in env var=value, and is it allowed by POSIX?

Best Answer

So, what is the purpose of allowing an arbitrary name in env var=value, and is it allowed by POSIX?

Quoting from POSIX: Environment Variables:

Environment variable names used by the utilities in the Shell and Utilities volume of POSIX.1-2008 consist solely of uppercase letters, digits, and the underscore ( '_' ) from the characters defined in Portable Character Set and do not begin with a digit. Other characters may be permitted by an implementation; applications shall tolerate the presence of such names.

Note: Other applications may have difficulty dealing with environment variable names that start with a digit. For this reason, use of such names is not recommended anywhere.

So implementations of env may permit arbitrary environment variable names - and most, if not all, implementations do so, accepting every non-NUL character to the left of an '=' - and implementations of other utilities (such as the shell) may or may not permit arbitrary names.

The statement that name=value ... utility is equivalent to env name=value utility holds only if both the env implementation and the shell permit name as an environment variable name.
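As a rough illustration of that condition, a script can check whether a name is usable as a shell variable before choosing between the two forms. The function name is_shell_name is hypothetical, and the pattern encodes the usual shell rule (letters, digits, and underscore, not starting with a digit), which is an assumption about the shell rather than something env enforces.

```shell
# Hypothetical helper: succeeds if $1 could be a shell variable name.
# The body runs in a subshell so the LC_ALL change (intended to make
# the bracket ranges ASCII-only) does not leak into the caller.
is_shell_name() (
  LC_ALL=C
  case $1 in
    '' | [0-9]* | *[!A-Za-z0-9_]*) exit 1 ;;
  esac
  exit 0
)

for n in FOO_1 1FOO 'BASH_FUNC_foo%%'; do
  if is_shell_name "$n"; then
    echo "$n: name=value and env name=value should both work"
  else
    echo "$n: only env name=value can be expected to set it"
  fi
done
```

Names that fail the check are exactly the ones for which the two forms stop being interchangeable.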

There's an interesting Austin Group thread about this issue: Invalid shell assignments in environment. One point mentioned is that shells generally only allow environment variables whose names can be represented as shell variables. Several participants in that thread are also active on unix.stackexchange.com and can hopefully add more information about the issue.