What’s special about “!xxx%s%s%s%s%s%s%s%s”?

csh, historical-unix, history

I was linked to The Unix-Haters Handbook and stumbled on (page 149):

Subject: Relevant Unix bug

October 11, 1991

Fellow W4115x students—

While we’re on the subject of activation records, argument passing, and calling conventions, did you know that typing:

!xxx%s%s%s%s%s%s%s%s

to any C-shell will cause it to crash immediately? Do you know why?

Questions to think about:

  • What does the shell do when you type !xxx?
  • What must it be doing with your input when you type
    !xxx%s%s%s%s%s%s%s%s?
  • Why does this crash the shell?
  • How could you (rather easily) rewrite the offending part of the shell
    so as not to have this problem?

Purely out of curiosity, can anyone explain what was the problem? Unsurprisingly, searching Google for the string doesn't help. Searching for other quotes from the message only gave me other copies of the message but no explanation.

Best Answer

I don't feel like digging for the sources of 25-year-old shells, but it could be a format-string vulnerability.

If the shell contains code like

printf(str);

where str is a string taken from the user's input, the contents of that string become the format string that printf uses. Each %s tells printf to print a string pointed to by an argument. If no such arguments are passed (as above, where there's only the format string), the function reads whatever happens to sit where the arguments would be, typically other data on the stack, and follows it as pointers, probably accessing unmapped memory and crashing the process.
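As a minimal sketch of the safe pattern (the function name format_event_error and the exact message text are hypothetical, not taken from any real shell), the user's text is passed as an argument to a fixed format string instead of being used as the format itself:

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper, not from csh: builds an "event not found"
 * message in out. The user's text is passed as a %s argument, so any
 * '%' characters in it are copied verbatim rather than interpreted.
 * Returns what snprintf returns (the length the full message needs).
 */
int format_event_error(char *out, size_t outsz, const char *user_text)
{
    /* Unsafe variant, shown only as a comment:
     *     snprintf(out, outsz, user_text);
     * Here every "%s" inside user_text would be treated as a
     * conversion and would read arguments that were never passed,
     * which is undefined behavior. */
    return snprintf(out, outsz, "%s: Event not found.", user_text);
}
```

Calling format_event_error(buf, sizeof buf, "xxx%s%s%s%s%s%s%s%s") should leave the %s sequences in buf untouched, since they are data here and not part of the format.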

In a way, I think the wording of the message hints at a solution like this, too. If you type !xxx, what the shell visibly does is print an error message like !xxx: event not found. From there, it's not a big leap to trying to also print !xxx%s%s%s%s%s%s%s%s: event not found, with the implication of a format string vulnerability.


I shouldn't have, but I took a peek at the source here (4.3BSD-Tahoe/usr/src/bin/csh, dates are from 1988).

findev(cp, anyarg) in sh.lex.c looks like it might be the function that finds a matching history event: it walks through a linked list of struct Hist called Histlist. If it doesn't find anything, it calls seterr2(cp, ": Event not found"); through noev(). cp here looks to be the string being searched for in the history.

seterr2() sets the variable err to the concatenation of its arguments, and err is then used as if (err) error(err); in a couple of places in process(), in sh.c. Finally, error() (in sh.err.c) contains a classic format string vulnerability: if (s) printf(s, arg), printf(".\n");

In some other places, error() is called with an argument, like error("Unknown user: %s", gpath + 1);, so clearly the idea is that the first argument to error() may be a format string.
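Since error() is meant to accept format strings, one possible fix (an assumption on my part, not what any csh release actually did) would be to neutralize % in the user-supplied text before it is concatenated into the message; another would be to print err through a fixed "%s" format. A rough sketch of the first option, where escape_percent and model_error are hypothetical names and model_error is only a simplified stand-in for csh's error() that writes to a buffer:

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of csh): copies src into dst, doubling
 * every '%' so the result is inert when later used as a printf format.
 * Returns 0 on success, -1 if dst is too small.
 */
int escape_percent(char *dst, size_t dstsz, const char *src)
{
    size_t j = 0;

    if (dstsz == 0)
        return -1;
    for (size_t i = 0; src[i] != '\0'; i++) {
        size_t need = (src[i] == '%') ? 2 : 1;
        if (j + need >= dstsz)      /* keep room for the terminating NUL */
            return -1;
        if (src[i] == '%')
            dst[j++] = '%';         /* "%%" prints as a single '%' */
        dst[j++] = src[i];
    }
    dst[j] = '\0';
    return 0;
}

/*
 * Simplified stand-in for error(): msg is used as the format string,
 * mirroring printf(s, arg) but writing into a buffer.
 */
int model_error(char *out, size_t outsz, const char *msg, const char *arg)
{
    return snprintf(out, outsz, msg, arg);
}
```

With this, passing the escaped form of xxx%s%s through model_error should reproduce the original text, because each %% is printed as a literal % and no argument is consumed.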

I wouldn't be honest if I said I understood the history substitution functions in full. It's pretty much uncommented manual string handling in C. % does have a special meaning in history substitution, but I can only see it being handled specially as the first character (as in !%) or after findev() is called.
