Bash – shell functions and variables with the same name

Tags: bash, environment-variables, function

From Bash Manual:

Note that shell functions and variables with the same name may result in multiple identically-named entries in the environment
passed to the shell’s children. Care should be taken in cases where
this may cause a problem.

How can bash distinguish "shell functions and variables with the
same name"?

$  func () { return 3; }; func=4; declare -p func; declare -f func;
declare -- func="4"
func () 
{ 
  return 3
}

When does "multiple identically-named entries in the environment
passed to the shell’s children" happen?

What "care" should be taken for what problem?

Best Answer

The general story: separate namespaces

Generally shells distinguish between variables and functions because they're used in different contexts. In a nutshell, a name is a variable name if it appears after a $, or as an argument to builtins such as export (without -f) and unset (without -f). A name is a function name if it appears as a command (after alias expansion) or as an argument to export -f, unset -f, etc.
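As a small illustration of the two namespaces, here is a sketch that uses one name for both a variable and a function. The name greet is made up for this example.

```shell
#!/usr/bin/env bash
# "greet" is a hypothetical name used for both a variable and a function.
greet="hello from the variable"
greet() { echo "hello from the function"; }

var_result="$greet"       # after a $, the name refers to the variable
func_result="$(greet)"    # in command position, the name refers to the function
echo "$var_result"
echo "$func_result"

unset -f greet                       # removes only the function
after_unset_f="${greet-<unset>}"     # the variable is still set
echo "$after_unset_f"

unset -v greet                       # removes only the variable
after_unset_v="${greet-<unset>}"
echo "$after_unset_v"
```

The -v flag is spelled out here to make the intent explicit; the point is only that each builtin picks the namespace from its options, not from the name.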

Variables can be exported to the environment. The name of the environment variable is the same as the shell variable (and the values are the same too).
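A minimal sketch of the difference between a plain shell variable and an exported one, as seen from a child process. The names local_only and shared are made up, and the sketch assumes neither is already present in the environment.

```shell
#!/usr/bin/env bash
# "local_only" and "shared" are hypothetical variable names.
local_only="blue"      # shell variable only, never exported
shared="circle"
export shared          # now also an environment variable, same name and value

# The child process receives only what was exported.
child_view="$(sh -c 'echo "local_only=${local_only-<unset>} shared=${shared-<unset>}"')"
echo "$child_view"
```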

With older bash: confusion due to function export

Bash, unlike most other shells, can also export functions to the environment. Since there's no type indication in the environment, there's no way to recognize whether an entry in the environment is a function or not, other than by analyzing the name or the value of the environment variable.

Older versions of bash stored a function in the environment using the function's name as the name, and something that looks like the function definition as the function's value. For example:

bash-4.1$ foobar () { echo foobar; }
bash-4.1$ export -f foobar
bash-4.1$ env |grep -A1 foobar
foobar=() {  echo foobar
}
bash-4.1$ 

Note that there's no way to distinguish a function whose code is { echo foobar; } from a variable whose value is () { echo foobar␤} (where ␤ is a newline character). This turned out to be a bad design decision.

Sometimes shell scripts are invoked with environment variables whose values are controlled by a potentially hostile entity; CGI scripts are a typical example. Bash's function export/import feature allowed injecting functions that way. For example, executing the script

#!/bin/bash
ls

from a remote request is safe as long as the environment doesn't contain variables with certain sensitive names (such as PATH). But if the request can set the environment variable ls to () { cat /etc/passwd; }, then an older bash would import it as a function named ls and happily execute cat /etc/passwd, since that's the body of the ls function.

With newer bash: confusion mostly alleviated

This security vulnerability was discovered by Stéphane Chazelas as one of the aspects of the Shellshock bug. In post-Shellshock versions of bash, exported functions are identified by their name rather than by their content.

bash-4.3$ foobar () { echo foobar; }
bash-4.3$ export -f foobar
bash-4.3$ env |grep -A1 foobar
BASH_FUNC_foobar%%=() {  echo foobar
}
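One way to check this on a given machine is to pass a plainly named variable whose value mimics the old export format and see how the child bash treats it. This is only a sketch: the name probe is made up, and the result depends on the bash found in PATH.

```shell
#!/usr/bin/env bash
# "probe" is a hypothetical name; its value imitates the pre-Shellshock format.
result="$(env 'probe=() { echo hijacked; }' bash -c '
  if declare -F probe >/dev/null; then
    echo "imported as a function"
  else
    echo "plain variable: $probe"
  fi')"
echo "$result"
```

A post-Shellshock bash should take the second branch and treat probe as an ordinary variable, because the name does not follow the BASH_FUNC_...%% scheme.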

There is no security issue now because names like BASH_FUNC_foobar%% are not commonly used as command names, and can be filtered out by interfaces that allow passing environment variables. It's technically possible to have a % character in the name of an environment variable (that's what makes modern bash's exported functions work), but normally people don't do this because shells don't accept % in the name of a variable.
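As a rough sketch of such filtering, the exported function can be removed from a child's environment by name. This assumes an env that supports the -u option (GNU, BSD and BusyBox versions do, though it is not required by POSIX) and a bash that uses the BASH_FUNC_foobar%% naming shown above.

```shell
#!/usr/bin/env bash
foobar() { echo foobar; }
export -f foobar

# Hypothetical check: does a child bash see foobar as a function?
check='declare -F foobar >/dev/null && echo "function present" || echo "function absent"'

unfiltered="$(bash -c "$check")"
# env -u drops one entry from the child's environment before running bash.
filtered="$(env -u 'BASH_FUNC_foobar%%' bash -c "$check")"

echo "unfiltered: $unfiltered"
echo "filtered:   $filtered"
```

An interface that forwards environment variables could apply the same idea more broadly, for instance by refusing any name that starts with BASH_FUNC_.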

The sentence in the bash manual refers to the old (pre-Shellshock) behavior. It should be updated or removed. With modern bash versions, there is no ambiguity in the environment if you assume that environment variables won't have a name ending in %%.
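To see this with a modern bash, the sketch below exports a variable and a function that share one name and then inspects what a child receives. The name item is made up for the example.

```shell
#!/usr/bin/env bash
# "item" is a hypothetical name used for both a variable and a function.
item="the variable"
item() { echo "the function"; }
export item        # exported as item=...
export -f item     # exported under a separate, mangled name

# The child bash gets both and keeps them apart.
child_output="$(bash -c 'echo "$item"; item')"
echo "$child_output"

# Count the environment entries whose name is exactly "item".
plain_entries="$(env | grep -c '^item=')"
echo "$plain_entries"
```

With a post-Shellshock bash there should be a single entry named item, so the "multiple identically-named entries" the manual warns about do not appear.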
