Why doesn’t unsetenv() modify /proc/pid/environ

environment-variables glibc

I was just looking at this question and wrote a noddy program to demonstrate unsetenv() modifying /proc/pid/environ. To my surprise it has no effect!

Here's what I did:

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int main(void)
{
  printf("pid=%d\n", getpid());
  printf("sleeping 10...\n");
  sleep(10);
  printf("unsetenv result: %d\n", unsetenv("WIBBLE"));
  printf("unset; sleeping 10 more...\n");
  sleep(10);

  return 0;
}

However, when I run

WIBBLE=hello ./test_program

then I see WIBBLE in the environment both before and after the unsetenv() runs:

# before the unsetenv()
$ tr '\0' '\n' < /proc/498/environ | grep WIBBLE
WIBBLE=hello
# after the unsetenv()
$ tr '\0' '\n' < /proc/498/environ | grep WIBBLE
WIBBLE=hello

Why doesn't unsetenv() modify /proc/pid/environ?

Best Answer

When a program starts, it receives its environment as an array of pointers to strings of the form var=value. On Linux, those are located at the bottom of the stack. At the very bottom, all the strings are packed one after the other (that's what /proc/pid/environ shows). Above them is a NULL-terminated array of pointers to those strings. That array is what goes into char *envp[] in int main(int argc, char* argv[], char* envp[]), and what the libc generally initialises environ to.

putenv()/setenv()/unsetenv() do not modify those strings, and they generally don't even modify the pointers. On some systems, both the strings and the pointers are read-only.

While the libc will generally initialise char **environ to the address of the first pointer above, any modification of the environment (and those are only meant for future execs) will generally cause a new array of pointers to be created and assigned to environ.

If environ is initially [a,b,c,d,NULL], where a is a pointer to x=1, b to y=2, c to z=3 and d to q=5, then after unsetenv("y"), environ has to become [a,c,d,NULL]. On systems where the initial array is read-only, a new list has to be allocated, filled with [a,c,d,NULL] and assigned to environ. Upon the next unsetenv(), that list can be modified in place. Only unsetenv("x") above could avoid the reallocation: environ could just be incremented to point to &envp[1]. (I don't know if some libc implementations actually perform that optimisation.)

In any case, there's no reason for the strings themselves, stored at the bottom of the stack, to be modified in any way. Even if an unsetenv() implementation modified the data initially received on the stack in place, it would only modify the pointers; it wouldn't go to the trouble of also erasing the strings they point to. That seems to be what the GNU libc does on Linux systems (with ELF executables at least): it modifies the list of pointers at envp in place as long as the number of environment variables doesn't increase.

You can observe the behaviour using a program like:

#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
extern char **environ;
int main(int argc, char* argv[], char* envp[]) {
  char cmd[128];
  int i;

  printf("envp: %p environ: %p\n", envp, environ);
  for (i = 0; envp[i]; i++)
    printf("  envp[%d]: %p (%s)\n", i, envp[i], envp[i]);

#define DO(x) x; puts("\nAfter " #x "\n"); \
  printf("envp: %p environ: %p\n", envp, environ); \
  for (i = 0; environ[i]; i++) \
    printf("  environ[%d]: %p (%s)\n", i, environ[i], environ[i])

  DO(unsetenv("a"));
  DO(setenv("b", "xxx", 1));
  DO(setenv("c", "xxx", 1));

  puts("\nAddress of heap and stack:");
  snprintf(cmd, sizeof cmd, "grep -e stack -e heap /proc/%ld/maps", (long)getpid());

  fflush(stdout);
  system(cmd);
  return 0;
}

On Linux with the GNU libc, when run as env -i a=1 x=3 ./e, that gives the output below, with comments inline (klibc, musl libc and dietlibc behave the same, except that they use mmapped anonymous memory instead of the heap for allocated memory):

envp: 0x7ffc2e7b3238 environ: 0x7ffc2e7b3238
  envp[0]: 0x7ffc2e7b4fec (a=1)
  envp[1]: 0x7ffc2e7b4ff0 (x=3)
   # envp[1] is almost at the bottom of the stack. I lied above in that
   # there are more things like the path of the executable
   # environ initially points to the same pointer list as envp

After unsetenv("a")

envp: 0x7ffc2e7b3238 environ: 0x7ffc2e7b3238
  environ[0]: 0x7ffc2e7b4ff0 (x=3)
   # here, unsetenv has reused the envp[] list and has not allocated a new
   # list. It has shifted the pointers though and not done the optimisation
   # I mention above

After setenv("b", "xxx", 1)

envp: 0x7ffc2e7b3238 environ: 0x1bb3420
  environ[0]: 0x7ffc2e7b4ff0 (x=3)
  environ[1]: 0x1bb3440 (b=xxx)
   # a new list has been allocated on the heap. (it could have reused the
   # slot freed by unsetenv() above but didn't, Solaris' version does).
   # the "b=xxx" string is also allocated on the heap.

After setenv("c", "xxx", 1)

envp: 0x7ffc2e7b3238 environ: 0x1bb3490
  environ[0]: 0x7ffc2e7b4ff0 (x=3)
  environ[1]: 0x1bb3440 (b=xxx)
  environ[2]: 0x1bb3420 (c=xxx)

Address of heap and stack:
01bb3000-01bd4000 rw-p 00000000 00:00 0                            [heap]
7ffc2e794000-7ffc2e7b5000 rw-p 00000000 00:00 0                    [stack]

On FreeBSD (11-rc1 here), a new list is already allocated upon the first unsetenv(). Not only that, but the strings themselves are copied onto the heap as well, so after the first modification of the environment, environ is completely disconnected from the envp[] that the program received on start-up:

envp: 0x7fffffffedd8 environ: 0x7fffffffedd8
  envp[0]: 0x7fffffffef74 (x=2)
  envp[1]: 0x7fffffffef78 (a=1)

After unsetenv("a")

envp: 0x7fffffffedd8 environ: 0x800e24000
  environ[0]: 0x800e15008 (x=2)

After setenv("b", "xxx", 1)

envp: 0x7fffffffedd8 environ: 0x800e24000
  environ[0]: 0x800e15018 (b=xxx)
  environ[1]: 0x800e15008 (x=2)

After setenv("c", "xxx", 1)

envp: 0x7fffffffedd8 environ: 0x800e24000
  environ[0]: 0x800e15020 (c=xxx)
  environ[1]: 0x800e15018 (b=xxx)
  environ[2]: 0x800e15008 (x=2)

On Solaris (11 here), we see the optimisation mentioned above, where unsetenv("a") ends up being done with an environ++. The slot freed by unsetenv() is then reused for b, but of course a new list of pointers has to be allocated upon the insertion of a new environment variable (c):

envp: 0xfeffef6c environ: 0xfeffef6c
  envp[0]: 0xfeffefec (a=1)
  envp[1]: 0xfeffeff0 (x=2)

After unsetenv("a")

envp: 0xfeffef6c environ: 0xfeffef70
  environ[0]: 0xfeffeff0 (x=2)

After setenv("b", "xxx", 1)

envp: 0xfeffef6c environ: 0xfeffef6c
  environ[0]: 0x806145c (b=xxx)
  environ[1]: 0xfeffeff0 (x=2)

After setenv("c", "xxx", 1)

envp: 0xfeffef6c environ: 0x8061c48
  environ[0]: 0x8061474 (c=xxx)
  environ[1]: 0x806145c (b=xxx)
  environ[2]: 0xfeffeff0 (x=2)

Related Solutions

Shell – Why $SHELL doesn’t change when I run new shell

You shouldn't expect this variable to change. It is used to store the path to your default shell, i.e. the one stored in the password database, not which shell you're currently running.

Linux Kernel – How to Change /proc/PID/environ After Process Start

On Linux, you can overwrite the value of the environment strings on the stack.

So you can hide the entry by overwriting it with zeros or anything else:

#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char* argv[], char* envp[]) {
  char cmd[100];

  while (*envp) {
    if (strncmp(*envp, "k=", 2) == 0)
      memset(*envp, 0, strlen(*envp));

    envp++;
  }

  snprintf(cmd, sizeof cmd, "cat /proc/%ld/environ", (long)getpid());

  system(cmd);
  return 0;
}

Run as:

$ env -i a=foo k=v b=bar ./wipe-env | hd
00000000  61 3d 66 6f 6f 00 00 00  00 00 62 3d 62 61 72 00  |a=foo.....b=bar.|
00000010

The k=v entry has been overwritten with \0\0\0.

Note that overwriting the value with setenv("k", "", 1) won't work, as in that case a new "k=" string is allocated and the original string is left untouched.

If you've not otherwise modified the k environment variable with setenv()/putenv(), then you should also be able to do something like this to get the address of the k=v string on the stack (well, of one of them):

#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>


int main(int argc, char* argv[]) {
  char cmd[100];
  char *e = getenv("k");

  if (e) {
    e -= strlen("k=");
    memset(e, 0, strlen(e));
  }

  snprintf(cmd, sizeof cmd, "cat /proc/%ld/environ", (long)getpid());

  system(cmd);
  return 0;
}

Note however that it removes only one of the k=v entries received in the environment. Usually, there is only one, but nothing is stopping anyone from passing both k=v1 and k=v2 (or k=v twice) in the env list passed to execve(). That has been the cause of security vulnerabilities in the past such as CVE-2016-2381. It could genuinely happen with bash prior to shellshock when exporting both a variable and function by the same name.

In any case, there will always be a small window during which the env var string has not been overwritten yet, so you may want to find another way to pass the secret information to the command (like a pipe for instance) if exposing it via /proc/pid/environ is a concern.

Also note that, contrary to /proc/pid/cmdline, /proc/pid/environ is only accessible by processes with the same euid or by root (or by root only if the euid and ruid of the process are not the same, it would seem).

You can hide that value from them in /proc/pid/environ, but they may still be able to get any other copy you've made of the string in memory, for instance by attaching a debugger to it.

See https://www.kernel.org/doc/Documentation/security/Yama.txt for ways to prevent at least non-root users from doing that.
