Bash – How does `env X='() { (a)=>\' sh -c "echo date"` work?

bash, shellshock

After reading about the latest bash vulnerability, I was wondering how Tavis Ormandy's exploit works. How does (a)=>\ work?

He posted:

The bash patch seems incomplete to me, function parsing is still brittle. e.g.
$ env X='() { (a)=>\' sh -c "echo date"; cat echo

Best Answer

GNU Bash exports shell functions in environment variables which include the function definitions:

$ function foo { echo bar; }
$ export -f foo
$ env | grep -A1 foo
foo=() { echo bar
}

When a new Bash instance is spawned, it looks for environment variables matching a certain pattern. The contents of those variables are automatically imported as shell functions. As Stéphane Chazelas explains, ever since this feature was introduced in Bash 1.03, functions were imported simply by replacing the = in the corresponding entry of the environment variable array with a space and interpreting the result as a function definition. Prior to the patch that fixed CVE-2014-6271, the environment variable was interpreted in its entirety, including any commands that follow the actual function body. The patch introduces two special modes to the parse_and_execute() function, SEVAL_FUNCDEF and SEVAL_ONECMD. When the function is called with SEVAL_FUNCDEF, it is supposed to prevent the interpretation of commands other than function definitions. The SEVAL_ONECMD flag is supposed to prevent the function from evaluating more than a single command.
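To illustrate the pre-patch behaviour described above, the widely circulated check for CVE-2014-6271 places a command after the function body. This is only a sketch: it assumes a bash binary on PATH, and the variable name x is arbitrary.

```shell
# The value looks like an exported function followed by an extra command.
# An unpatched bash runs the trailing command while importing the variable,
# so "vulnerable" appears before "test"; a patched bash prints only "test".
env x='() { :;}; echo vulnerable' bash -c 'echo test'
```

The extra output, if any, appears before the requested command runs, because the import happens during shell start-up.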

Tavis Ormandy's specially crafted environment variable does something subtly different. It is designed to confuse the parser and corrupt the buffer used to store the commands to be evaluated. Remnants of the environment variable in the buffer change the interpretation of the subsequent command. This related issue has received the CVE identifier CVE-2014-7169.

The constituents of the environment variable definition X='() { (a)=>\' are:

() { which is interpreted by the parser as the beginning of a function definition
(a)= is intended to confuse the parser and cause it to leave remnants of the environment variable in the buffer
>\ is the actual payload that is left in the buffer

The purpose of the payload is to change the interpretation of the command executed in the subshell invoked by sh -c "echo date". This of course assumes that /bin/sh is a symbolic link to bash. When the command string specified as an operand to -c is placed in the buffer, the contents of the buffer are:

>\[0xA]echo date

The [0xA] is an ASCII newline character, which would normally act as a command separator, but is now escaped by the \ from the payload. As a result, the contents of the buffer are interpreted as

>echo date

Because Bash allows redirection operators to precede commands, this is equivalent to

date > echo

This simply causes the date command to be executed with its standard output redirected to a file called echo. The remaining cat echo is not part of the exploit; it only demonstrates that a file called echo containing the output of date now exists.
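The two parsing rules the payload relies on (a redirection may precede the command, and backslash-newline is a line continuation) can be tried harmlessly in any POSIX shell, without involving the bug at all. The following is a small sketch; the file names and the use of printf instead of date are choices made for this illustration.

```shell
workdir=$(mktemp -d)
(
    cd "$workdir" || exit 1
    # A redirection may come before the command name: printf runs with its
    # standard output sent to a file named "echo".
    >echo printf 'written by printf\n'
    cat echo
    # Backslash-newline is dropped as a line continuation, so the child
    # shell parses the two lines below as: >echo2 printf "continued\n"
    sh -c '>\
echo2 printf "continued\n"'
    cat echo2
)
rm -rf "$workdir"
```

Both cat commands show the text that printf wrote, which confirms that the word following > was taken as a file name rather than as the command.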

As to why the string (a)= confuses the parser in this case, it would seem to be related to it appearing as a (malformed) nested function definition. The simplified variant of the exploit demonstrates this more clearly:

$ X='() { function a a>\' bash -c echo
$ ls echo
echo

TL;DR

The shellshock vulnerability is fully fixed in

On the bash-2.05b branch: 2.05b.10 and above (patch 10 included)
On the bash-3.0 branch: 3.0.19 and above (patch 19 included)
On the bash-3.1 branch: 3.1.20 and above (patch 20 included)
On the bash-3.2 branch: 3.2.54 and above (patch 54 included)
On the bash-4.0 branch: 4.0.41 and above (patch 41 included)
On the bash-4.1 branch: 4.1.14 and above (patch 14 included)
On the bash-4.2 branch: 4.2.50 and above (patch 50 included)
On the bash-4.3 branch: 4.3.27 and above (patch 27 included)

If your bash shows an older version, your OS vendor may still have patched it themselves, so it is best to check.
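To compare an installed bash against the list above, the version and patch level can be read from the BASH_VERSINFO array. This is a minimal sketch that assumes the bash to inspect is the first one on PATH; as noted, a vendor build may carry backported fixes that this number does not reflect.

```shell
# Elements 0, 1 and 2 of BASH_VERSINFO are the major version, the minor
# version and the patch level, printed here in the branch.patch form
# used in the list above.
bash -c 'printf "%s.%s.%s\n" "${BASH_VERSINFO[0]}" "${BASH_VERSINFO[1]}" "${BASH_VERSINFO[2]}"'
```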

If:

env xx='() { echo vulnerable; }' bash -c xx

shows "vulnerable", you're still vulnerable. That is the only test that is relevant (whether the bash parser is still exposed to code in any environment variable).

Details.

The bug was in the initial implementation of the function exporting/importing introduced on the 5th of August 1989 by Brian Fox, and first released in bash-1.03 about a month later, at a time when bash was not in such widespread use, before security was that much of a concern, and before HTTP, the web or Linux even existed.

From the ChangeLog in 1.05:

Fri Sep  1 18:52:08 1989  Brian Fox  (bfox at aurel)

       * readline.c: rl_insert ().  Optimized for large amounts
         of typeahead.  Insert all insertable characters at once.

       * I update this too irregularly.
         Released 1.03.
[...]
Sat Aug  5 08:32:05 1989  Brian Fox  (bfox at aurel)

       * variables.c: make_var_array (), initialize_shell_variables ()
         Added exporting of functions.

Some discussions in gnu.bash.bug and comp.unix.questions around that time also mention the feature.

It's easy to understand how it got there.

bash exports the functions in env vars like

foo=() {
  code
}

And on import, all it has to do is interpret that with the = replaced with a space... except that it should not blindly interpret it.
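The danger of interpreting the entry blindly can be reproduced with a few lines of portable shell. The sketch below is an illustration only: naive_import is a hypothetical helper written for this example, not bash's actual C implementation.

```shell
# Hypothetical helper: import one environment-style entry ("name=() { ... }")
# the naive way, by swapping the first = for a space and evaluating it.
naive_import() {
    entry=$1
    name=${entry%%=*}   # text before the first =
    body=${entry#*=}    # text after the first =
    eval "$name $body"  # blindly interpreted: this is the flaw
}

# A well-formed entry simply defines the function.
naive_import 'foo=() { echo bar; }'
foo

# Anything after the closing brace is executed during the import itself.
naive_import 'evil=() { :; }; echo injected'
```

The first call defines foo, which then prints bar; the second prints injected at import time, even though the function evil is never called.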

It's also broken in that in bash (contrary to the Bourne shell), scalar variables and functions have different namespaces. Actually if you have

foo() { echo bar; }; export -f foo
export foo=bar

bash will happily put both in the environment (yes, entries with the same variable name) but many tools (including many shells) won't propagate them.

One could also argue that bash should use a BASH_ namespace prefix for that, as those are env vars only relevant from bash to bash. rc uses a fn_ prefix for a similar feature.

A better way to implement it would have been to put the definitions of all exported functions in a single variable like:

BASH_FUNCDEFS='f1() { echo foo;}
  f2() { echo bar;}...'

That would still need to be sanitized but at least that could not be more exploitable than $BASH_ENV or $SHELLOPTS...

There is a patch that prevents bash from interpreting anything other than the function definition in there (https://lists.gnu.org/archive/html/bug-bash/2014-09/msg00081.html), and that's the one that has been applied in all the security updates from the various Linux distributions.

However, bash still interprets the code in there and any bug in the interpreter could be exploited. One such bug has already been found (CVE-2014-7169) though its impact is a lot smaller. So there will be another patch coming soon.

Until there is a hardening fix that prevents bash from interpreting code in arbitrary variables (like the BASH_FUNCDEFS approach above), we can't know for sure that we're not vulnerable to some other bug in the bash parser. And I believe such a hardening fix will be released sooner or later.

Edit 2014-09-28

Two additional bugs in the parser have been found (CVE-2014-718{6,7}). Note that most shells are bound to have bugs in their parser for corner cases; that wouldn't have been a concern if that parser hadn't been exposed to untrusted data.

While all 3 bugs 7169, 7186 and 7187 have been fixed in subsequent patches, Red Hat pushed for the hardening fix. In their patch, they changed the behaviour so that functions were exported in variables called BASH_FUNC_myfunc(), more or less preempting Chet's design decision.

Chet later published that fix as an official upstream bash patch.

That hardening patch, or variants of it, are now available for most major Linux distributions, and it eventually made it to Apple OS X.

That now addresses the concern of any arbitrary env var exploiting the parser via that vector, including two other vulnerabilities in the parser (CVE-2014-627{7,8}) that were disclosed later by Michał Zalewski (CVE-2014-6278 being almost as bad as CVE-2014-6271), thankfully after most people had had time to install the hardening patch.

Bugs in the parser will be fixed as well, but they are no longer that much of an issue now that the parser is no longer so easily exposed to untrusted input.
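On a bash that carries the hardening patch, the effect can be observed by exporting a function and looking at the environment entry it produces. This is a sketch assuming a patched bash on PATH; the exact suffix after the function name depends on whose variant of the patch was applied, so the pattern below only anchors on the BASH_FUNC_ prefix.

```shell
# Export a function and show how it is encoded in the environment.
# With the hardening patch, the entry name starts with BASH_FUNC_
# instead of being the bare function name.
bash -c 'foo() { echo bar; }; export -f foo; env | grep "^BASH_FUNC_foo"'
```

An unpatched bash would print nothing here, since it exports the function under the plain name foo.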

Note that while the security vulnerability has been fixed, it's likely that we'll see some changes in that area. The initial fix for CVE-2014-6271 has broken backward compatibility in that it stops importing functions with . or : or / in their name. Those can still be declared by bash though, which makes for inconsistent behaviour. Because functions with . and : in their name are commonly used, it's likely a patch will restore accepting at least those from the environment.

Why wasn't it found earlier?

That's also something I wondered about. I can offer a few explanations.

First, I think that if a security researcher (and I'm not a professional security researcher) had specifically been looking for vulnerabilities in bash, they would have likely found it.

For instance, if I were a security researcher, my approaches could be:

Look at where bash gets input from and what it does with it. And the environment is an obvious one.
Look in what places the bash interpreter is invoked and on what data. Again, it would stand out.
The importing of exported functions is one of the features that is disabled when bash is setuid/setgid, which makes it an even more obvious place to look.

Now, I suspect nobody thought to consider bash (the interpreter) as a threat, or that the threat could have come that way.

The bash interpreter is not meant to process untrusted input.

Shell scripts (not the interpreter) are often looked at closely from a security point of view. The shell syntax is so awkward and there are so many caveats with writing reliable scripts (ever seen me or others mentioning the split+glob operator or why you should quote variables for instance?) that it's quite common to find security vulnerabilities in scripts that process untrusted data.

That's why you often hear that you shouldn't write CGI shell scripts, and why setuid scripts are disabled on most Unices. Or that you should be extra careful when processing files in world-writeable directories (see CVE-2011-0441 for instance).

The focus is on that, the shell scripts, not the interpreter.

You can expose a shell interpreter to untrusted data (feeding foreign data as shell code to interpret) via eval or . or calling it on user provided files, but then you don't need a vulnerability in bash to exploit it. It's quite obvious that if you're passing unsanitized data for a shell to interpret, it will interpret it.

So the shell is called in trusted contexts. It's given fixed scripts to interpret and more often than not (because it's so difficult to write reliable scripts) fixed data to process.

For instance, in a web context, a shell might be invoked in something like:

popen("sendmail -oi -t", "w");

What can possibly go wrong with that? If something wrong is envisaged, that's about the data fed to that sendmail, not how that shell command line itself is parsed or what extra data is fed to that shell. There's no reason you'd want to consider the environment variables that are passed to that shell. And if you do, you realise it's all env vars whose names start with "HTTP_" or are well known CGI env vars like SERVER_PROTOCOL or QUERY_STRING, none of which the shell or sendmail have any business with.

In privilege elevation contexts like when running setuid/setgid or via sudo, the environment is generally considered and there have been plenty of vulnerabilities in the past, again not against the shell itself but against the things that elevate the privileges like sudo (see for instance CVE-2011-3628).

For instance, bash doesn't trust the environment when setuid or called by a setuid command (think mount for instance that invokes helpers). In particular, it ignores exported functions.

sudo does clean the environment: everything by default except for a white list, and if configured not to, it at least blacklists a few variables that are known to affect one shell or another (like PS4, BASH_ENV, SHELLOPTS...). It also blacklists the environment variables whose content starts with () (which is why CVE-2014-6271 doesn't allow privilege escalation via sudo).

But again, that's for contexts where the environment cannot be trusted: any variable with any name and value can be set by a malicious user in that context. That doesn't apply to web servers/ssh or all the vectors that exploit CVE-2014-6271 where the environment is controlled (at least the name of the environment variables is controlled...)

It's important to block a variable like echo="() { evil; }", but not HTTP_FOO="() { evil; }", because HTTP_FOO is not going to be called as a command by any shell script or command line. And apache2 is never going to set an echo or BASH_ENV variable.

It's quite obvious some environment variables should be black-listed in some contexts based on their name, but nobody thought that they should be black-listed based on their content (except for sudo). Or in other words, nobody thought that arbitrary env vars could be a vector for code injection.
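Filtering on content rather than on name can be sketched in a couple of lines. The helper below is hypothetical and only illustrates the idea of flagging values that start with (); it is not how sudo implements its check.

```shell
# Hypothetical helper: print the names of environment variables whose
# value begins with "()", i.e. values that an unpatched bash would try
# to import as function definitions.
list_funclike_vars() {
    awk 'BEGIN {
        for (name in ENVIRON)
            if (index(ENVIRON[name], "()") == 1)
                print name
    }'
}

# Example: a CGI-style variable carrying a function-like value is flagged.
( export HTTP_FOO='() { evil; }'; list_funclike_vars )
```

Note that the check looks at values only, so it flags HTTP_FOO here even though no script would ever call a command of that name.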

As to whether extensive testing when the feature was added could have caught it, I'd say it's unlikely.

When you test for the feature, you test for functionality. The functionality works fine. If you export the function in one bash invocation, it's imported alright in another. A very thorough testing could have spotted issues when both a variable and function with the same name are exported or when the function is imported in a locale different from the one it was exported in.

But to be able to spot the vulnerability, it's not a functionality test you would have had to do. The security aspect would have had to be the main focus, and you wouldn't be testing the functionality, but the mechanism and how it could be abused.

It's not something that developers (especially in 1989) often have at the back of their mind, and a shell developer could be excused for thinking his software is unlikely to be network exploitable.

Bash displays international characters as escape sequences

You can thank someone named Lino Miguel Martins Tinoco from 2004 for this one.

The .inputrc format, as described in the GNU Readline documentation, does not allow in-line comments. Both that documentation and the GNU Bourne Again shell manual say:

Lines beginning with a `#' are comments.

The line

set output-meta on     # Enable Meta output with eighth bit set

is not a line beginning with #. It's a line with a # in the middle. As Lino Miguel Martins Tinoco found, this results in the output-meta option being off, not on, as evident in the output of bind -V when xe ran it:

output-meta is set to `off'

.inputrc is not a shell script. As the Linux From Scratch tutorial says:

Note that comments cannot be on the same line as commands.
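A quick way to look for this mistake is to search an inputrc file for set lines that carry a trailing #. The helper below is a hypothetical sketch and only a heuristic: it would also flag a line whose value legitimately contains a # preceded by whitespace.

```shell
# Hypothetical helper: list "set ..." lines that contain a # after
# whitespace, which Readline does not treat as a comment.
# $1: path to an inputrc-style file.
find_inline_comments() {
    grep -n '^[[:space:]]*set[[:space:]].*[[:space:]]#' "$1"
}

# Demonstration on a throwaway file rather than a real ~/.inputrc.
sample=$(mktemp)
printf '%s\n' \
    '# a proper comment line' \
    'set output-meta on     # Enable Meta output with eighth bit set' \
    'set input-meta on' > "$sample"
find_inline_comments "$sample"
rm -f "$sample"
```

Only the second line of the sample is reported; the fix is to move the comment onto its own line above the set command.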
