Since when do POSIX and GNU rm not delete /?


For several years now, the GNU rm utility has refused to delete / unless it is called with the --no-preserve-root option. However, the command rm -rf / has been lodged in the collective subconscious as dangerous for a very long time, and people still often cite it as a "scary" command.

I was wondering when the rule that rm cannot delete / first appeared. I checked the POSIX specs: POSIX:2008 includes this safety feature, while POSIX:2001 does not. Since the online versions of the POSIX specs are updated with each new sub-release, I also checked the Wayback Machine, found the relevant page of POSIX:2008 as archived in 2010, and confirmed that the rule was already listed then.

So, my questions are:

When was the rule that rm cannot remove / added to the POSIX specs? Was it in the original 2008 edition of the Single UNIX Specification version 4 or was it added in a revision?
When was this limitation added to GNU rm? I'm pretty sure it was before it was added to POSIX, but when did it happen?

Best Answer

You can find the HTML version of all the editions of POSIX 2008 online:

original: http://pubs.opengroup.org/onlinepubs/9699919799.2008edition/utilities/rm.html
TC1 (2013 edition) http://pubs.opengroup.org/onlinepubs/9699919799.2013edition/utilities/rm.html
TC2 (2016 edition) http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/utilities/rm.html

The rule that rm must not remove / was added in the original 2008 edition.

Technical corrigenda generally don't add new features.

You can see the previous version (http://pubs.opengroup.org/onlinepubs/009695399/utilities/rm.html) (POSIX 2004) didn't have that text.

The new text was accepted at the 2003-05-09 Austin Group conference for inclusion in a later revision of the standard.

It was requested by John Beck of Sun Microsystems in March that same year (link requires opengroup registration, see also Enhancement Request Number 5 here).

John Beck wrote, on Tue 11 Mar 2003:

@ page 820 line 31681-31683 section rm comment {JTB-1}

Problem:

Defect code :  3. Clarification required

An occasional user mistake, with devastating consequences, is to
write a shell script with a line such as:
      rm -rf $VARIABLE1/$VARIABLE2
or
      rm -rf /$VARIABLE1
without verifying that either variable is set, which can lead to
      rm -rf /
being the resulting command.  Since there is no plausible
circumstance under which this is the desired behavior, it seems
reasonable to disallow this.  Such a safeguard would, however,
violate the current specification.

Action:

Either extend the exceptions for . and .. on the noted lines
to list / as well, or specify that the behavior of rm if an
operand resolves to / is undefined.

GNU rm added the --preserve-root and --no-preserve-root options in this 2003-11-09 commit, but --preserve-root only became the default in this 2006-09-03 commit, that is, in coreutils 6.2.

FreeBSD has been preserving slash since that 2004-10-04 commit (with a "Find out how flame-proof my underwear really is" commit log), but initially not when POSIXLY_CORRECT was set. A decade later they remembered to check and found that POSIX was now mandating it, at which point the safeguard was applied in POSIX mode as well.

The FreeBSD initial commit mentions Solaris was already doing it at that time.

@JdePB (in a comment below) found that link to a Sun insider story, which corroborates the Solaris origin, gives more details on it, and suggests Solaris already had the safeguard in place before Sun made the request to the Austin Group.

It explains the rationale for adding that exclusion. While one can only blame oneself for typing rm -rf /, a script can end up running it by doing rm -rf -- "$1/$2" without checking that $1 and $2 were provided. According to that link, this is what hit some Sun customers badly when they misapplied a Solaris patch.
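The failure mode described here, a path built from arguments that were never checked, can also be guarded against in the script itself. What follows is only a minimal sketch: checked_target is a hypothetical helper name, not something taken from the Solaris patch scripts or from any standard tool.

```shell
#!/bin/sh
# checked_target is a hypothetical helper, not part of rm or any standard tool.
# It prints the path built from its two arguments, or fails if either one is
# empty or unset, so a missing argument cannot silently collapse the path.
checked_target() {
    if [ -z "${1-}" ] || [ -z "${2-}" ]; then
        echo "checked_target: empty path component, refusing" >&2
        return 1
    fi
    printf '%s\n' "$1/$2"
}

# Intended use: only call rm when the path could be built.
# target=$(checked_target "$1" "$2") && rm -rf -- "$target"
```

POSIX shells also offer the ${VAR:?message} expansion, which aborts a non-interactive script when VAR is unset or empty; the explicit test above just makes the check visible.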

The forbidding of deletion of . and .. was added long before that and again to safeguard against potential mishaps. rm still is a dangerous command. It does what it's meant to do: remove what you tell it to.

rm -rf /*
cd /tmp &&  rm -rf .*/   # on some systems where rm -rf ../ still removes
                         # the content of ../ and shells that still
                         # may include . and .. in glob expansions.
rm -rf -- "$diretcory"/* # note the misspelled variable name
dir='foo '; rm -rf $dir/*

Would also remove everything. Shell filename completion has been known to cause such problems when you do

rm -rf someth<Tab>/*

Expanded to:

rm -rf something /*

Because something happened not to be a directory, so completion appended a space rather than a slash.

Shells like tcsh or zsh will add an extra prompt when trying to call rm with a * wildcard (tcsh not by default).
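The dir='foo ' case above comes down to word splitting of an unquoted expansion. As a rough illustration that deletes nothing, the sketch below counts the arguments a command would receive; count_args is a hypothetical helper, and globbing is switched off so that /* stays literal.

```shell
#!/bin/sh
# count_args is a hypothetical helper that reports how many arguments it got.
count_args() { echo "$#"; }

dir='foo '   # trailing space, as in the example above

set -f       # turn off globbing so /* stays literal and nothing is listed
unquoted=$(count_args $dir/*)     # splits into two words: foo and /*
quoted=$(count_args "$dir"/*)     # stays one word: "foo /*"
set +f

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
```

With the default IFS this reports two arguments for the unquoted form and one for the quoted form, which is why the unquoted rm would be handed /* as a separate operand.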

Related Solutions

permissions – Unable to Delete File Even When Running as Root

Check the permissions of the directory. To delete a file inside it, the directory should be writable by you:

chmod ugo+w .

and not immutable or append-only:

chattr -i -a .

Check with ls -la and lsattr -a.
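The directory check can be scripted as well. This is a small sketch under stated assumptions: can_unlink_in_parent is a hypothetical helper name, and it relies only on the shell's -w and -x tests, so it is not a substitute for inspecting attribute flags with lsattr and it ignores sticky-bit rules.

```shell
#!/bin/sh
# can_unlink_in_parent is a hypothetical helper. It only asks the shell's
# -w and -x tests about the parent directory; use lsattr for attribute flags.
can_unlink_in_parent() {
    parent=$(dirname -- "$1")
    # Removing a directory entry needs write and search (execute)
    # permission on the directory that contains it.
    [ -w "$parent" ] && [ -x "$parent" ]
}

# Intended use:
# can_unlink_in_parent ./somefile || ls -ld .
```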

Is the historical Unix V5 tr command padding behavior of set2 different from what we consider today “classic” System V (1983-1988) behavior?

The difference is only in the wording of the padding behavior in the V4-V5 manual; the behavior is the same throughout. As it stands, the results of the V5 implementation are identical to those of the System V one, which are themselves identical to GNU tr's behavior with the --truncate-set1 option. Furthermore, "truncating set1 to the length of set2" gives the same result as "padding string2 with corresponding characters from string1". It means the same thing in practice. Let's demonstrate this.

First, you need not be a developer to compile this. Compare the source code with the almost identical PWB/Unix version: the only difference is essentially the reliance on the "modern" stdio.h facilities. I've therefore stripped the source of its references to inbuf, fout, dup and flush and replaced them with what PWB/Unix does. This should in no way alter the behavior, as the algorithms remain untouched. I've annotated the trivial changes I've made to the original:

#include <stdio.h>    /* added */
int dflag = 0;        /* added "=" sign to these initializers */
int sflag = 0;
int cflag = 0;
int save = 0;
char code[256];
char squeez[256];
char vect[256];
struct string { int last, max, rep; char *p; } string1, string2;
FILE *input;          /* added: part of the stdio framework, I guess */

main(argc,argv)
char **argv;
{
    int i, j;
    int c, d;
    char *compl;

    string1.last = string2.last = 0;
    string1.max = string2.max = 0;
    string1.rep = string2.rep = 0;
    string1.p = string2.p = "";

    if(--argc>0) {
        argv++;
        if(*argv[0]=='-'&&argv[0][1]!=0) {
            while(*++argv[0])
                switch(*argv[0]) {
                case 'c':
                    cflag++;
                    continue;
                case 'd':
                    dflag++;
                    continue;
                case 's':
                    sflag++;
                    continue;
                }
            argc--;
            argv++;
        }
    }
    if(argc>0) string1.p = argv[0];
    if(argc>1) string2.p = argv[1];
    for(i=0; i<256; i++)
        code[i] = vect[i] = 0;
    if(cflag) {
        while(c = next(&string1))
            vect[c&0377] = 1;
        j = 0;
        for(i=1; i<256; i++)
            if(vect[i]==0) vect[j++] = i;
        vect[j] = 0;
        compl = vect;
    }
    for(i=0; i<256; i++)
        squeez[i] = 0;
    for(;;){
        if(cflag) c = *compl++;
        else c = next(&string1);
        if(c==0) break;
        d = next(&string2);
        if(d==0) d = c;
        code[c&0377] = d;
        squeez[d&0377] = 1;
    }
    while(d = next(&string2))
        squeez[d&0377] = 1;
    squeez[0] = 1;
    for(i=0;i<256;i++) {
        if(code[i]==0) code[i] = i;
        else if(dflag) code[i] = 0;
    }

    input = stdin;                     /* added: stdio again */
    while((c=getc(input)) != EOF ) {   /* changed to use stdio */
        if(c == 0) continue;
        if(c = code[c&0377]&0377)
            if(!sflag || c!=save || !squeez[c&0377])
                putchar(save = c);
    }

}

next(s)
struct string *s;
{
    int a, b, c, n;
    int base;

    if(--s->rep > 0) return(s->last);
    if(s->last < s->max) return(++s->last);
    if(*s->p=='[') {
        nextc(s);
        s->last = a = nextc(s);
        s->max = 0;
        switch(nextc(s)) {
        case '-':
            b = nextc(s);
            if(b<a || *s->p++!=']')
                goto error;
            s->max = b;
            return(a);
        case '*':
            base = (*s->p=='0')?8:10;
            n = 0;
            while((c = *s->p)>='0' && c<'0'+base) {
                n = base*n + c - '0';
                s->p++;
            }
            if(*s->p++!=']') goto error;
            if(n==0) n = 1000;
            s->rep = n;
            return(a);
        default:
        error:
            write(1,"Bad string\n",11);
            exit(0);     /* original was exit(); */
        }
    }
    return(nextc(s));
}

nextc(s)
struct string *s;
{
    int c, i, n;

    c = *s->p++;
    if(c=='\\') {
        i = n = 0;
        while(i<3 && (c = *s->p)>='0' && c<='7') {
            n = n*8 + c - '0';
            i++;
            s->p++;
        }
        if(i>0) c = n;
        else c = *s->p++;
    }
    if(c==0) *--s->p = 0;
    return(c&0377);
}

So cc tr.c compiles, with one warning:

tr.c: In function ‘next’:
tr.c:118:4: warning: incompatible implicit declaration of built-in function ‘exit’ [enabled by default]
exit(0);
^

But a.out is there and works, so let's now compare the padding behavior of the two programs we have:

GNU tr

#tr 0123456789 d     
0123456789 input
dddddddddd output             <----- BSD classic behavior

#tr 0123456789 d123456789     <----- padding set2 with set1 explicitly 
0123456789 i
d123456789 o
01234567890123456789 i
d123456789d123456789 o

#tr -t 0123456789 d           <----- --truncate-set1 i.e. System V behavior
0123456789 i
d123456789 o                  <----- concretely, this is what is meant by a result 
0012 i                               where set2 was padded with set1
dd12 o

#tr -t 0123456789 d123456789  <----- padding set2 with set1 explicitly
0123456789 i                  
d123456789 o                  <----- note this is identical to the last results

Unix V5 tr + stdio mod

#./a.out 0123456789 d         <----- our compiled version with the classic example
0123456789 i
d123456789 o

#./a.out 0123456789 d123456789 <----- padding set2 with set1 explicitly
0123456789 i
d123456789 o

So our V5 version behaves exactly like the System V version in that respect. Furthermore, explicitly padding set2 with set1 yields the same result for all implementations because it ensures that set1 and set2 have the same number of elements (and it's when you don't have this that results vary historically).

Finally, explicitly padding or having tr pad set2 with set1 as described in the original V4-V5 manuals means the same thing as truncating set1 to the length of set2 insofar as results are concerned - it IS the classic System V implementation for padding and yields the same results. V5 tr is not a different implementation, despite the difference in the man pages.
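The equivalence can also be checked with any tr by doing the padding by hand. The sketch below is illustrative only: pad_set2 and sysv_style_tr are hypothetical helper names, and they handle literal characters only (no ranges, escapes, or [x*n] repeats).

```shell
#!/bin/sh
# pad_set2 and sysv_style_tr are hypothetical helpers for illustration.
# They handle literal characters only: no ranges, escapes, or [x*n] repeats.

# Pad set2 with the characters of set1 that lie beyond set2's length.
pad_set2() {
    rest=$1
    skip=${#2}
    while [ "$skip" -gt 0 ] && [ -n "$rest" ]; do
        rest=${rest#?}          # drop one leading character of set1
        skip=$((skip - 1))
    done
    printf '%s%s\n' "$2" "$rest"
}

# Translate stdin with set2 padded explicitly, so both sets have equal length.
sysv_style_tr() {
    tr "$1" "$(pad_set2 "$1" "$2")"
}

pad_set2 0123456789 d                          # prints d123456789
printf '0012\n' | sysv_style_tr 0123456789 d   # prints dd12
```

Because both sets have the same length once set2 is padded, the historical differences between implementations no longer come into play, which matches the transcripts above.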
