How Was the Shellshock Bash Vulnerability Found?

bash bugs shellshock vulnerability

Since this bug affects so many platforms, we might learn something from the process by which this vulnerability was found: was it an εὕρηκα (eureka) moment or the result of a security check?

Since we know Stéphane found the Shellshock bug, and others may know the process as well, we would be interested in the story of how he came to find the bug.

Best Answer

To reassure a few: I didn't find the bug by observing exploits, and I have no reason to believe it was exploited before being disclosed (though of course I can't rule it out). I did not find it by looking at bash's code either.

I can't say I remember my exact train of thought at the time.

It more or less came from reflecting on some behaviours of some software that I find dangerous (the behaviours, not the software). The kind of behaviour that makes you think: that doesn't sound like a good idea.

In this case, I was reflecting on the common configuration of ssh that allows passing environment variables unsanitised from the client, provided their name starts with LC_. The idea is that people can keep using their own language when sshing into other machines. A good idea, until you start to consider how complex localisation handling is, especially when UTF-8 is brought into the equation (and seeing how badly it's handled by many applications).

Back in July 2014, I had already reported a vulnerability in glibc localisation handling which, combined with that sshd config and two other dangerous behaviours of the bash shell, allowed (authenticated) attackers to hack into git servers, provided they were able to upload files there and bash was used as the login shell of the git unix user (CVE-2014-0475).

I was thinking it was probably a bad idea to use bash as the login shell of users offering services over ssh, given that it's quite a complex shell (when all you need is just parsing a very simple command line) and has inherited most of the misdesigns of ksh. Since I had already identified a few problems with bash being used in that context (to interpret ssh ForceCommands), I was wondering if there were potentially more there.

AcceptEnv LC_* allows any variable whose name starts with LC_, and I had the vague recollection that bash's exported functions (a dangerous, albeit at times useful, feature) used environment variables whose name was something like myfunction(), so I was wondering whether there was something interesting to look at there.

I was about to dismiss it on the grounds that the worst thing one could do would be to redefine a command called LC_something, which could not really be a problem as those are not existing command names, but then I started to wonder how bash imported those environment variables.

What if the variables were called LC_foo;echo test; f() for instance? So I decided to have a closer look.

A:

$ env -i bash -c 'zzz() { :;}; export -f zzz; env'
[...]
zzz=() {  :
}

revealed that my recollection was wrong in that the variables were not called myfunction() but myfunction (and it's the value that starts with ()).
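That observation can be sketched as a small shell check. This is only an illustration of the name/value split, not bash's actual import code: the helper name `looks_like_exported_function` is made up for this example, and it assumes the pre-patch encoding shown above, where an ordinary variable whose value begins with `() {` is taken to be an exported function.

```shell
#!/bin/sh
# Hypothetical helper (not part of bash): report whether a NAME=VALUE
# environment entry has the shape described above, i.e. an ordinary
# variable name whose value begins with "() {".
looks_like_exported_function() {
    entry=$1
    name=${entry%%=*}     # text before the first "="
    value=${entry#*=}     # text after the first "="
    case $value in
        '() {'*) echo "$name: function-like value" ;;
        *)       echo "$name: plain value" ;;
    esac
}

# An entry shaped like the env output above, and an ordinary one.
looks_like_exported_function 'zzz=() {  :
}'
looks_like_exported_function 'HOME=/root'
```

The point of the split is that nothing in the name marks the variable as special; only the first few characters of the value do.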

And a quick test:

$ env 'true;echo test; f=() { :;}' bash -c :
test
bash: error importing function definition for `true;echo test; f'

confirmed my suspicion that the variable name was not sanitized, and the code was evaluated upon startup.

Worse, a lot worse, the value was not sanitized either:

$ env 'foo=() { :;}; echo test' bash -c :
test

That meant that any environment variable could be a vector.
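The test above can be wrapped into a small check for whichever bash is first in PATH. This is a minimal sketch, not an authoritative scanner: the function name `probe_bash`, the variable name `probe` and the marker string `INJECTED` are arbitrary choices for this example, and it only covers the original form of the bug shown here.

```shell
#!/bin/sh
# Hypothetical helper: start bash with an arbitrary variable whose value is
# a function definition followed by an extra command, and see whether that
# extra command runs at startup.
probe_bash() {
    if ! command -v bash >/dev/null 2>&1; then
        echo "bash not found"
        return 0
    fi
    # "bash -c :" runs the no-op command, so any output comes from startup.
    out=$(env 'probe=() { :;}; echo INJECTED' bash -c : 2>/dev/null)
    case $out in
        *INJECTED*) echo "vulnerable" ;;
        *)          echo "not vulnerable" ;;
    esac
}

probe_bash
```

Because the variable name is arbitrary, the same value placed in any variable that reaches a bash process would behave the same way on an affected version.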

That's when I realised the extent of the problem. I confirmed that it was exploitable over HTTP as well (HTTP_xxx, QUERY_STRING... env vars), through other vectors such as mail processing services and, later, DHCP (and probably a long list of others), and reported it (carefully).
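For the HTTP case, the link to environment variables comes from CGI: as I understand the CGI specification (RFC 3875), a request header is handed to the script as a variable named HTTP_ followed by the header name in upper case with `-` replaced by `_`. A rough sketch of that naming rule follows; the helper name `header_to_env_name` is made up for this example, and real servers may filter or rename some headers.

```shell
#!/bin/sh
# Hypothetical helper: map an HTTP request header name to the environment
# variable name a CGI script would see, e.g. User-Agent -> HTTP_USER_AGENT.
header_to_env_name() {
    upper=$(printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]' | sed 's/-/_/g')
    printf 'HTTP_%s\n' "$upper"
}

header_to_env_name User-Agent
header_to_env_name Accept-Language
```

The header value is chosen by the client and becomes the value of that variable, which is all the bug needed if bash was started anywhere underneath the CGI script.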
