Linux – Understanding how mount namespaces work in Linux

filesystems linux mount namespace virtual-file-system

I am reading about mount namespaces and see:

in a mount namespace you can mount and unmount filesystems without it affecting the host filesystem. So you can have a totally different set of devices mounted (usually less).

I am trying to understand Linux namespaces, LXC, and the like, but I don't quite understand what the statement above means.

What I'm trying to understand is how a container (1) can have files like this:

/foo/a.txt
/foo/bar/b.txt

And another container (2) can have files like this:

/foo/a.txt
/foo/x.txt
/foo/bar/b.txt
/foo/bar/y.txt

Where /foo/a.txt and /foo/bar/b.txt on containers (1) and (2) are the same path, but perhaps they have different content:

# container (1)
cat /foo/a.txt #=> Hello from (1)

# container (2)
cat /foo/a.txt #=> Hello from (2)

This would mean that the files on the physical system (which I don't know anything about) are stored in one way, perhaps scattered all around. But then there is a centralized database of "virtual" files in the operating system, like this:

db:
  container1:
    foo:
      a.txt: Hello from a from (1)
      bar:
        b.txt: Hello from b from (1)
  container2:
    foo:
      a.txt: Hello from a from (2)
      x.txt: Hello from x from (2)
      bar:
        b.txt: Hello from b from (2)
        y.txt: Hello from y from (2)

Then there is another OS database for the physical files which might look like this:

drive1:
  dir1:
    foo:
      a.txt
      bar:
        b.txt
  dir2:
    foo:
      a.txt
      x.txt
      bar:
        b.txt
        y.txt

So when you create a file in the container, you are actually creating two new records:

  1. One in the drive-level map of physical files
  2. One in the container-level map of virtual files

This is how I imagine it works. It is how I can see a way to (1) present the user of an LXC container (or a cgroup, which I don't know much about) with what feels like a complete file system, in which they can (2) create their own fully customizable directory structure, possibly with the same file, directory, and path names as a completely different virtual file system, such that (3) the files from different virtual file systems or containers don't overwrite each other.

Is this how it works? If not, how does it actually work (an outline would be enough)?

Best Answer

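Mount namespaces differ only in the arrangement of mounted filesystems: each namespace has its own list of mount points, while the files themselves are not copied or tracked in a second database.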

This is very flexible, because a mount does not have to be a whole filesystem: it can be a bind mount of a subdirectory within a filesystem.

# unshare --mount  # run a shell in a new mount namespace

# mount --bind /usr/bin/ /mnt/
# ls /mnt/cp
/mnt/cp

# exit  # exit the shell, and hence the mount namespace

# ls /mnt/cp
ls: cannot access '/mnt/cp': No such file or directory
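
Applied to the example in the question, the same idea can be sketched as follows. This is only an illustration, not how LXC itself sets things up: the temporary directory `$base`, the directories `dir1` and `dir2`, and the helper `show` are hypothetical names, and `$base/foo` stands in for `/foo` so that nothing outside a temporary directory is touched. It assumes the util-linux `unshare` command and a kernel that allows unprivileged user namespaces; as root, plain `unshare --mount` would do.

```shell
#!/bin/sh
# Sketch: two mount namespaces see different files at the same path.
base=$(mktemp -d)                      # hypothetical scratch area
mkdir -p "$base/dir1" "$base/dir2" "$base/foo"
echo "Hello from (1)" > "$base/dir1/a.txt"
echo "Hello from (2)" > "$base/dir2/a.txt"

# Bind-mount directory $1 on $base/foo inside a fresh mount namespace,
# then read a.txt through the shared path. The mount vanishes when the
# namespace's last process exits.
show() {
    unshare --user --map-root-user --mount \
        sh -c 'mount --bind "$1" "$2" && cat "$2/a.txt"' sh "$1" "$base/foo"
}

if unshare --user --map-root-user --mount true 2>/dev/null; then
    out1=$(show "$base/dir1")          # expected: Hello from (1)
    out2=$(show "$base/dir2")          # expected: Hello from (2)
    echo "namespace 1: $out1"
    echo "namespace 2: $out2"
else
    echo "cannot create a mount namespace here (try as root)" >&2
fi

# Back in the original namespace neither bind mount is visible,
# so the mount point is still an empty directory.
host_view=$(ls "$base/foo")
echo "original namespace sees: '${host_view}'"

rm -rf "$base"
```

Both files live exactly once on disk, under `dir1` and `dir2`. Only the mount table differs between the two namespaces, which is why the same path can lead to different content.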

You can list your current set of mounts with the findmnt command.
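
If findmnt is not installed, the same information can be read from /proc. The following is a rough sketch, assuming a Linux system with /proc mounted; the variable names `ns_id` and `mount_count` are only for illustration.

```shell
#!/bin/sh
# Identity of the mount namespace this shell runs in. Two processes that
# print the same value here share one mount namespace.
ns_id=$(readlink /proc/self/ns/mnt)
echo "mount namespace: $ns_id"

# The kernel's mount list for this namespace, one mount per line.
# findmnt reads this same file by default.
mount_count=$(wc -l < /proc/self/mountinfo)
echo "mounts visible here: $mount_count"

# Field 5 is the mount point; field 4 is the directory inside the
# filesystem that is mounted there ("/" unless it is a bind mount of
# a subdirectory).
awk '{ print $5 " <- " $4 }' /proc/self/mountinfo | head -n 5
```

Running this once in a normal shell and once inside `unshare --mount` should show a different namespace identifier in each case.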

In a full container, the root mount is replaced and you work with an entirely separate tree of mounts. This involves some extra details, such as the pivot_root() system call. You probably don't need to know exactly how to do that. Some details are available in the question "How to perform chroot with Linux namespaces?"
