How Was the Shellshock Bash Vulnerability Found?

Tags: bash, bugs, shellshock, vulnerability

Since this bug affects so many platforms, we might learn something from the process by which this vulnerability was found: was it an εὕρηκα (eureka) moment or the result of a security check?

Since we know Stéphane found the Shellshock bug, and others may know the process as well, we would be interested in the story of how he came to find the bug.

Best Answer

To reassure a few: I didn't find the bug by observing exploits, and I have no reason to believe it had been exploited before being disclosed (though of course I can't rule it out). I did not find it by looking at bash's code either.

I can't say I remember my exact train of thought at the time.

It more or less came from reflecting on behaviours of some software that I find dangerous (the behaviours, not the software). The kind of behaviour that makes you think: that doesn't sound like a good idea.

In this case, I was reflecting on the common configuration of ssh that allows passing environment variables unsanitised from the client, provided their name starts with LC_. The idea is that people can keep using their own language when sshing into other machines. It's a good idea until you start to consider how complex localisation handling is, especially when UTF-8 is brought into the equation (and seeing how badly it's handled by many applications).

Back in July 2014, I had already reported a vulnerability in glibc localisation handling which, combined with that sshd config and two other dangerous behaviours of the bash shell, allowed (authenticated) attackers to hack into git servers, provided they were able to upload files there and bash was used as the login shell of the git unix user (CVE-2014-0475).

I was thinking it was probably a bad idea to use bash as the login shell of users offering services over ssh, given that it's quite a complex shell (when all you need is just parsing a very simple command line) and has inherited most of the misdesigns of ksh. Since I had already identified a few problems with bash being used in that context (to interpret ssh ForceCommands), I was wondering if there were potentially more there.

AcceptEnv LC_* allows any variable whose name starts with LC_, and I had the vague recollection that bash exported functions (a dangerous, albeit at times useful, feature) were using environment variables whose name was something like myfunction(). I was wondering whether there was something interesting to look at there.

I was about to dismiss it on the grounds that the worst thing one could do would be to redefine a command called LC_something, which could not really be a problem as those are not existing command names, but then I started to wonder how bash imported those environment variables.

What if the variables were called LC_foo;echo test; f() for instance? So I decided to have a closer look.

$ env -i bash -c 'zzz() { :;}; export -f zzz; env'
[...]
zzz=() {  :
}

revealed that my recollection was wrong in that the variables were not called myfunction() but myfunction (and it's the value that starts with ()).

And a quick test:

$ env 'true;echo test; f=() { :;}' bash -c :
test
bash: error importing function definition for `true;echo test; f'

confirmed my suspicion that the variable name was not sanitized, and the code was evaluated upon startup.

Worse, a lot worse, the value was not sanitized either:

$ env 'foo=() { :;}; echo test' bash -c :
test

That meant that any environment variable could be a vector.

That's when I realised the extent of the problem, confirmed that it was exploitable over HTTP as well (HTTP_xxx, QUERY_STRING... env vars), through other vectors like mail processing services and, later, DHCP (and probably a long list of others), and reported it (carefully).

TL;DR

The shellshock vulnerability is fully fixed in:

On the bash-2.05b branch: 2.05b.10 and above (patch 10 included)
On the bash-3.0 branch: 3.0.19 and above (patch 19 included)
On the bash-3.1 branch: 3.1.20 and above (patch 20 included)
On the bash-3.2 branch: 3.2.54 and above (patch 54 included)
On the bash-4.0 branch: 4.0.41 and above (patch 41 included)
On the bash-4.1 branch: 4.1.14 and above (patch 14 included)
On the bash-4.2 branch: 4.2.50 and above (patch 50 included)
On the bash-4.3 branch: 4.3.27 and above (patch 27 included)

If your bash shows an older version, your OS vendor may still have patched it themselves, so it's best to check.
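The branch list above can be turned into a mechanical check. What follows is only a sketch: the function names `min_fixed_patch` and `is_fully_patched` are made up for illustration, and the code only knows the eight branches listed above, so any other branch is reported as unknown (exit status 2) rather than guessed. Since a vendor-patched build may still report an older patch level, a failure here is a prompt to check further, not a verdict.

```shell
#!/bin/sh
# Hypothetical helpers (not part of bash): compare a bash version string
# against the per-branch patch levels listed above.

# Print the first fully fixed patch level for a branch, or fail if the
# branch is not in the list.
min_fixed_patch() {
  case $1 in
    2.05b) echo 10 ;;
    3.0)   echo 19 ;;
    3.1)   echo 20 ;;
    3.2)   echo 54 ;;
    4.0)   echo 41 ;;
    4.1)   echo 14 ;;
    4.2)   echo 50 ;;
    4.3)   echo 27 ;;
    *)     return 1 ;;
  esac
}

# is_fully_patched VERSION
# VERSION looks like "4.3.27" or "4.3.27(1)-release" (the $BASH_VERSION format).
# Exit status: 0 = at or above the listed patch level, 1 = below it,
# 2 = branch not in the list or version not understood.
is_fully_patched() {
  version=${1%%"("*}          # drop a trailing "(1)-release" if present
  branch=${version%.*}        # e.g. "4.3"
  patch=${version##*.}        # e.g. "27"
  case $patch in
    ''|*[!0-9]*) return 2 ;;  # patch level must be a plain number
  esac
  min=$(min_fixed_patch "$branch") || return 2
  [ "$patch" -ge "$min" ]
}
```

It could be called as `is_fully_patched "$(bash -c 'echo "$BASH_VERSION"')"`, then inspecting `$?`.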

If:

env xx='() { echo vulnerable; }' bash -c xx

shows "vulnerable", you're still vulnerable. That is the only test that is relevant (whether the bash parser is still exposed to code in any environment variable).
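For scripting that check across several machines, it can be wrapped in a small function. This is a minimal sketch: `shellshock_status` is a hypothetical name, and the optional argument (the shell binary to test) is my addition, not part of the original test.

```shell
#!/bin/sh
# Hypothetical wrapper around the one-line test above.
# Usage: shellshock_status [PATH_TO_BASH]
shellshock_status() {
  shell=${1:-bash}
  # A bash that still imports functions from arbitrary variables turns the
  # value of xx into a function named xx, so running "xx" prints "vulnerable".
  # Otherwise "xx" is simply not found, and the error message is discarded.
  if [ "$(env xx='() { echo vulnerable; }' "$shell" -c xx 2>/dev/null)" = vulnerable ]; then
    echo vulnerable
  else
    echo "not vulnerable"
  fi
}
```

Running `shellshock_status /bin/bash` prints one of the two words, which is easy to collect from a fleet of hosts.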

Details.

The bug was in the initial implementation of the function exporting/importing feature, introduced on the 5th of August 1989 by Brian Fox and first released in bash-1.03 about a month later. That was at a time when bash was not in such widespread use, before security was that much of a concern, and before HTTP, the web or Linux even existed.

From the ChangeLog in 1.05:

Fri Sep  1 18:52:08 1989  Brian Fox  (bfox at aurel)

       * readline.c: rl_insert ().  Optimized for large amounts
         of typeahead.  Insert all insertable characters at once.

       * I update this too irregularly.
         Released 1.03.
[...]
Sat Aug  5 08:32:05 1989  Brian Fox  (bfox at aurel)

       * variables.c: make_var_array (), initialize_shell_variables ()
         Added exporting of functions.

Some discussions in gnu.bash.bug and comp.unix.questions around that time also mention the feature.

It's easy to understand how it got there.

bash exports the functions in env vars like

foo=() {
  code
}

And on import, all it has to do is interpret that with the = replaced with a space... except that it should not blindly interpret it.

It's also broken in that in bash (contrary to the Bourne shell), scalar variables and functions have different namespaces. Actually, if you have

foo() { echo bar; }; export -f foo
export foo=bar

bash will happily put both in the environment (yes, entries with the same variable name), but many tools (including many shells) won't propagate them.

One would also argue that bash should use a BASH_ namespace prefix for that, as those are env vars that are only relevant from bash to bash. rc uses a fn_ prefix for a similar feature.

A better way to implement it would have been to put the definitions of all exported functions in a single variable like:

BASH_FUNCDEFS='f1() { echo foo;}
  f2() { echo bar;}...'

That would still need to be sanitized but at least that could not be more exploitable than $BASH_ENV or $SHELLOPTS...

There is a patch that prevents bash from interpreting anything else than the function definition in there (https://lists.gnu.org/archive/html/bug-bash/2014-09/msg00081.html), and that's the one that has been applied in all the security updates from the various Linux distributions.

However, bash still interprets the code in there and any bug in the interpreter could be exploited. One such bug has already been found (CVE-2014-7169) though its impact is a lot smaller. So there will be another patch coming soon.

Until there is a hardening fix that prevents bash from interpreting code in any variable (like the BASH_FUNCDEFS approach above), we won't know for sure that we're not vulnerable to a bug in the bash parser. And I believe such a hardening fix will be released sooner or later.

Edit 2014-09-28

Two additional bugs in the parser have been found (CVE-2014-718{6,7}) (note that most shells are bound to have bugs in their parser for corner cases, that wouldn't have been a concern if that parser hadn't been exposed to untrusted data).

While all 3 bugs (7169, 7186 and 7187) have been fixed in subsequent patches, Red Hat pushed for the hardening fix. In their patch, they changed the behaviour so that functions were exported in variables called BASH_FUNC_myfunc(), more or less preempting Chet's design decision.

Chet later published that fix as an official upstream bash patch.

That hardening patch, or variants of it, are now available for most major Linux distributions and eventually made it to Apple OS X.
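To see which naming scheme a given bash build uses for exported functions, the earlier `env -i bash -c '...; env'` experiment can be repeated and reduced to just the entry name. This is a sketch under assumptions: the helper name `exported_function_entry_name` is hypothetical, and the exact suffix after the function name depends on which variant of the hardening patch the build carries.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the environment entry that the
# local bash creates for an exported function called "myfunc".
exported_function_entry_name() {
  # Start from an empty environment, export one function, dump the
  # environment, and keep only the name of the entry whose value starts
  # with "() " and whose name mentions myfunc.
  env -i bash -c 'myfunc() { :; }; export -f myfunc; env' |
    sed -n 's/^\([^=]*myfunc[^=]*\)=() .*/\1/p'
}

exported_function_entry_name
```

A bash without the hardening patch would print plain `myfunc`; a hardened one should print a name starting with `BASH_FUNC_myfunc`.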

That now addresses the concern of any arbitrary env var exploiting the parser via that vector. That includes two other vulnerabilities in the parser (CVE-2014-627{7,8}) that were disclosed later by Michał Zalewski (CVE-2014-6278 being almost as bad as CVE-2014-6271), thankfully after most people had had time to install the hardening patch.

Bugs in the parser will be fixed as well, but they are no longer that much of an issue now that the parser is no longer so easily exposed to untrusted input.

Note that while the security vulnerability has been fixed, it's likely that we'll see some changes in that area. The initial fix for CVE-2014-6271 has broken backward compatibility in that it stops importing functions with . or : or / in their name. Those can still be declared by bash though which makes for an inconsistent behaviour. Because functions with . and : in their name are commonly used, it's likely a patch will restore accepting at least those from the environment.

Why wasn't it found earlier?

That's also something I wondered about. I can offer a few explanations.

First, I think that if a security researcher (and I'm not a professional security researcher) had specifically been looking for vulnerabilities in bash, they would have likely found it.

For instance, if I were a security researcher, my approaches could be:

Look at where bash gets input from and what it does with it. And the environment is an obvious one.
Look in what places the bash interpreter is invoked and on what data. Again, it would stand out.
The importing of exported functions is one of the features that is disabled when bash is setuid/setgid, which makes it an even more obvious place to look.

Now, I suspect nobody thought to consider bash (the interpreter) as a threat, or that the threat could have come that way.

The bash interpreter is not meant to process untrusted input.

Shell scripts (not the interpreter) are often looked at closely from a security point of view. The shell syntax is so awkward and there are so many caveats with writing reliable scripts (ever seen me or others mentioning the split+glob operator or why you should quote variables for instance?) that it's quite common to find security vulnerabilities in scripts that process untrusted data.

That's why you often hear that you shouldn't write CGI shell scripts, or why setuid scripts are disabled on most Unices. Or that you should be extra careful when processing files in world-writeable directories (see CVE-2011-0441 for instance).

The focus is on that, the shell scripts, not the interpreter.

You can expose a shell interpreter to untrusted data (feeding foreign data as shell code to interpret) via eval or . or calling it on user provided files, but then you don't need a vulnerability in bash to exploit it. It's quite obvious that if you're passing unsanitized data for a shell to interpret, it will interpret it.
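As a minimal illustration of that last point (the function names are made up for this sketch), compare splicing data into code handed to `eval` with passing the same data as a plain argument:

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Unsafe: the data is spliced into shell code, so the shell interprets it.
greet_unsafe() {
  eval "echo $1"
}

# Safe: the data is only ever passed as an argument, never parsed as code.
greet_safe() {
  printf '%s\n' "$1"
}

greet_unsafe 'hello; echo injected'   # runs two commands: echo hello, echo injected
greet_safe 'hello; echo injected'     # prints the string verbatim
```

No bug in the shell is involved in the first case: the interpreter does exactly what it was asked to do.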

So the shell is called in trusted contexts. It's given fixed scripts to interpret and more often than not (because it's so difficult to write reliable scripts) fixed data to process.

For instance, in a web context, a shell might be invoked in something like:

popen("sendmail -oi -t", "w");

What can possibly go wrong with that? If something wrong is envisaged, it's about the data fed to that sendmail, not how that shell command line itself is parsed or what extra data is fed to that shell. There's no reason you'd want to consider the environment variables that are passed to that shell. And if you do, you realise they are all env vars whose names start with "HTTP_" or are well-known CGI env vars like SERVER_PROTOCOL or QUERY_STRING, none of which the shell or sendmail have any business with.

In privilege elevation contexts like when running setuid/setgid or via sudo, the environment is generally considered and there have been plenty of vulnerabilities in the past, again not against the shell itself but against the things that elevate the privileges like sudo (see for instance CVE-2011-3628).

For instance, bash doesn't trust the environment when setuid or called by a setuid command (think mount for instance that invokes helpers). In particular, it ignores exported functions.

sudo does clean the environment: it removes everything by default except for a whitelist and, if configured not to, at least blacklists a few variables that are known to affect one shell or another (like PS4, BASH_ENV, SHELLOPTS...). It also blacklists the environment variables whose content starts with () (which is why CVE-2014-6271 doesn't allow privilege escalation via sudo).

But again, that's for contexts where the environment cannot be trusted: any variable with any name and value can be set by a malicious user in that context. That doesn't apply to web servers, ssh or all the other vectors that exploit CVE-2014-6271, where the environment is controlled (at least the names of the environment variables are controlled).

It's important to block a variable like echo="() { evil; }", but not HTTP_FOO="() { evil; }", because HTTP_FOO is not going to be called as a command by any shell script or command line. And apache2 is never going to set an echo or BASH_ENV variable.

It's quite obvious some environment variables should be black-listed in some contexts based on their name, but nobody thought that they should be black-listed based on their content (except for sudo). Or in other words, nobody thought that arbitrary env vars could be a vector for code injection.
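Filtering on content rather than on name can be sketched in a few lines of shell. To be clear, this is not how sudo implements it; it is an illustrative wrapper with a made-up name (`run_without_function_like_vars`), and it assumes an `env` that supports `-u` (GNU and BSD `env` do, though POSIX does not require it).

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with every environment variable whose
# value starts with "()" removed, whatever its name is.
run_without_function_like_vars() {
  # Collect the names of variables whose value starts with "()". Only names
  # made of letters, digits and underscores are considered, so the unquoted
  # expansion below cannot split or glob unexpectedly. A continuation line
  # of a multi-line value that happens to look like NAME=()... would cause
  # NAME to be dropped too, which errs on the cautious side.
  for name in $(env | sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)=().*/\1/p'); do
    set -- -u "$name" "$@"   # prepend "-u NAME" to the arguments given to env
  done
  env "$@"
}
```

For example, `run_without_function_like_vars sendmail -oi -t` would start sendmail without any HTTP_FOO-style variable carrying a function-like value, while leaving all other variables alone.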

As to whether extensive testing when the feature was added could have caught it, I'd say it's unlikely.

When you test for the feature, you test for functionality. The functionality works fine. If you export the function in one bash invocation, it's imported alright in another. A very thorough testing could have spotted issues when both a variable and function with the same name are exported or when the function is imported in a locale different from the one it was exported in.

But to be able to spot the vulnerability, it's not a functionality test you would have had to do. The security aspect would have had to be the main focus, and you wouldn't be testing the functionality, but the mechanism and how it could be abused.

It's not something that developers (especially in 1989) often have at the back of their mind, and a shell developer could be excused for thinking his software is unlikely to be network exploitable.

Bash – Legacy Debian versions and Bash Shellshock

You have the option to just upgrade bash. To do so, use the following apt-get command:

apt-get update

Then, after the update has fetched the list of available updates, run:

apt-get install --only-upgrade bash

To get updates on older releases, Squeeze for example, you will probably need to add the Squeeze-LTS repo to your sources.list.

To add this repository, edit /etc/apt/sources.list and add the following line to the end of the file.

deb http://ftp.us.debian.org/debian squeeze-lts main non-free contrib

To check a particular system for the vulnerabilities (or to see whether the upgrade worked), you can check which bash version you are running and whether that version is affected (it probably is), or use one of the numerous shell test scripts available on the web.

EDIT 1

To upgrade bash on Lenny or Etch, take a look at Ilya Sheershoff's answer below for how to compile bash from source and manually upgrade the version of bash that your release is using.

EDIT 2

Here is an example sources.list file from a Squeeze server I successfully upgraded:

deb http://ftp.us.debian.org/debian/ squeeze main
deb-src http://ftp.us.debian.org/debian/ squeeze main

deb http://security.debian.org/ squeeze/updates main
deb-src http://security.debian.org/ squeeze/updates main

# squeeze-updates, previously known as 'volatile'
deb http://ftp.us.debian.org/debian/ squeeze-updates main
deb-src http://ftp.us.debian.org/debian/ squeeze-updates main

# Other - Adding the lts source for security updates
deb http://http.debian.net/debian/ squeeze-lts main contrib non-free
deb-src http://http.debian.net/debian/ squeeze-lts main contrib non-free