After reading about Linux namespaces I was under the impression that they are, amongst a lot of other features, an alternative to chroot. For example, in this article:
Other uses [of namespaces] include […] chroot()-style isolation of a process to a portion of the single directory hierarchy.
However, when I clone the mount namespace, for example with the following command, I still see the whole original root tree.
unshare --mount -- /bin/bash
I understand that I am now able to perform additional mounts in the new namespace that are not shared with the original namespace, and thus this provides isolation, but it is still the same root, e.g. /etc is still the same for both namespaces. Do I still need chroot to change the root or is there an alternative?
I was expecting that this question would provide an answer, but the answer only uses chroot, again.
EDIT #1
There was a now-deleted comment that mentioned pivot_root. Since this is actually part of linux/fs/namespace.c, it is in fact part of the namespaces implementation. This suggests that changing the root directory with only unshare and mount is not possible, but that namespaces provide their own, cleverer version of chroot. Still, even after reading the source code, I do not get the main idea of this approach that makes it fundamentally different from chroot (in the sense of e.g. security or better isolation).
EDIT #2
This is not a duplicate of this question. After executing all the commands from the answer I have a separate /tmp/tmp.vyM9IwnKuY (or similar), but the root directory is still the same!
Best Answer
Entering a mount namespace before setting up a chroot lets you avoid cluttering the host namespace with additional mounts, e.g. for /proc. You can use chroot inside a mount namespace as a nice and simple hack.
I think there are advantages to understanding pivot_root, but it has a bit of a learning curve. The documentation does not quite explain everything, although there is a usage example in man 8 pivot_root (for the shell command). man 2 pivot_root (for the system call) might be clearer if it did the same, and included an example C program.
How to use pivot_root
Immediately after entering the mount namespace, you also need mount --make-rslave / or equivalent. Otherwise, all your mount changes propagate to the mounts in the original namespace, including the pivot_root. You don't want that :).
If you used the unshare --mount command, note it is documented to apply mount --make-rprivate by default. AFAICS this is a bad default and you don't want this in production code. E.g. at this point, it would stop eject from working on a mounted DVD or USB in the host namespace. The DVD or USB would remain mounted inside the private mount tree, and the kernel would not let you eject the DVD.
Once you've done that, you can mount e.g. the /proc directory you will be using. The same way you would for chroot.
Unlike when you use chroot, pivot_root requires that your new root filesystem is a mount point. If it is not one already, you can satisfy this by simply applying a bind mount: mount --rbind new_root new_root.
Use pivot_root, and then umount the old root filesystem with the -l / MNT_DETACH option. (You don't need umount -R, which can take longer.)
Technically, using pivot_root generally needs to involve using chroot as well; it's not "either-or".
As per man 2 pivot_root, it's only defined as swapping the root of the mount namespace. It isn't defined to change which physical directory the process root is pointing to. Or the current working directory (/proc/self/cwd). It happens that it does do so, but this is a hack to handle kernel threads. The manpage says that could change in future.
Usually you want this sequence:
The position of the chroot in this sequence is yet another subtle detail. Although the point of pivot_root is to rearrange the mount namespace, the kernel code seems to find the root filesystem to move by looking at the per-process root, which is what chroot sets.
Why to use pivot_root
In principle, it makes sense to use pivot_root for security and isolation. I like to think about the theory of capability-based security. You pass in a list of the specific resources needed, and the process can access no other resources. In this case we are talking about the filesystems passed in to a mount namespace. This idea applies generally to the Linux "namespaces" feature, though I'm probably not expressing it very well.
chroot only sets the process root, but the process still refers to the full mount namespace. If a process retains the privilege to perform chroot, then it can traverse back up the filesystem namespace. As detailed in man 2 chroot, "the superuser can escape from a 'chroot jail' by...".
Another thought-provoking way to undo chroot is nsenter --mount=/proc/self/ns/mnt. This is perhaps a stronger argument for the principle. nsenter / setns() necessarily re-loads the process root, from the root of the mount namespace... although the fact that this works when the two refer to different physical directories might be considered a kernel bug. (Technical note: there could be multiple filesystems mounted on top of each other at the root; setns() uses the top, most recently mounted one.)
This illustrates one advantage of combining a mount namespace with a "PID namespace". Being inside a PID namespace would prevent you from entering the mount namespace of an unconfined process. It also prevents you entering the root of an unconfined process (/proc/$PID/root). And of course a PID namespace also prevents you from killing any process which is outside it :-).
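One way to observe this is to compare mount namespace identifiers. The following is a minimal sketch, assuming /proc is mounted; mnt_ns_id and same_mnt_ns are hypothetical helper names. Inside a PID namespace with its own /proc, the lookup for an outside process would be expected to fail simply because that PID has no entry there.

```sh
#!/bin/sh
# Sketch: report whether two processes share a mount namespace by comparing
# the identifiers behind /proc/PID/ns/mnt. Helper names are hypothetical.

# Print the mount namespace identifier of a PID, e.g. "mnt:[4026531840]".
# Fails if the PID is not visible or we lack permission to inspect it.
mnt_ns_id() {
    readlink "/proc/$1/ns/mnt"
}

# Exit status 0 if both PIDs are in the same mount namespace, 1 if they
# differ, 2 if either one could not be inspected.
same_mnt_ns() {
    a=$(mnt_ns_id "$1") || return 2
    b=$(mnt_ns_id "$2") || return 2
    [ "$a" = "$b" ]
}
```

For example, same_mnt_ns $$ 1 compares the current shell with PID 1; as an unprivileged user this will usually return 2, because inspecting another user's process needs extra privileges.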