Linux Namespace – Why Bind Mount is Visible Outside Its Mount Namespace

bind-mount linux namespace

I'm trying to get a handle on how Linux mount namespaces work, so I ran a small experiment: I opened two terminals and ran the following:

Terminal 1

root@goliath:~# mkdir a b
root@goliath:~# touch a/foo.txt
root@goliath:~# unshare --mount -- /bin/bash
root@goliath:~# mount --bind a b
root@goliath:~# ls b
foo.txt

Terminal 2

root@goliath:~# ls b
foo.txt

How come the mount is visible in Terminal 2? Since that shell is not part of the new mount namespace, I expected the directory to appear empty there. I also tried passing -o shared=no and using the --make-private option with mount, but I got the same result.

What am I missing and how can I make it actually private?

Best Answer

If you are on a systemd-based distribution with a util-linux version older than 2.27, you will see this unintuitive behavior. A mount namespace created with CLONE_NEWNS starts out as a copy of the parent's mounts, including each mount's propagation type (shared, private, and so on). The kernel's default propagation type is private, but systemd changes the root mount to shared, so mounts made inside the new namespace propagate back to the original one. As of util-linux 2.27, the unshare command sets the propagation in the new mount namespace to private by default, which is more intuitive.
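
To see which case applies on your machine, you can check how the root mount propagates. The following is a minimal sketch, not part of the original answer: mount_propagation is a hypothetical helper name, and it assumes a Linux system where /proc/self/mountinfo is readable. It looks for the shared:N, master:N and unbindable tags in the optional fields of that file.

```shell
# Hypothetical helper: report how the mount at a mount point (default: /)
# propagates, based on the optional fields in /proc/self/mountinfo.
mount_propagation() {
    awk -v target="${1:-/}" '
        $5 == target {                       # field 5 is the mount point
            state = "private"                # no tag at all means private
            # Optional fields start at field 7 and end at a lone "-".
            for (i = 7; i <= NF && $i != "-"; i++) {
                if ($i == "unbindable") state = "unbindable"
                if ($i ~ /^master:/)    state = "slave"
                if ($i ~ /^shared:/)  { state = "shared"; break }
            }
        }
        # If several mounts are stacked on the target, the last one wins.
        END { print (state == "" ? "not a mount point" : state) }
    ' /proc/self/mountinfo
}

mount_propagation /
```

If this reports shared for / before you run unshare, an older unshare will carry that setting into the new namespace, which matches the behavior described in the question.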

Solution

If you are on a systemd system with util-linux prior to version 2.27, you must remount the root filesystem after running the unshare command:

# unshare --mount -- /bin/bash
# mount --make-private -o remount /

If you are on a systemd system with util-linux version 2.27 or later, the example in your question should work as expected, verbatim, with no need to remount. If it does not, pass --propagation private to the unshare command to force the propagation in the new mount namespace to be private.
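
The experiment from the question can be repeated in one script with the propagation set explicitly. This is only a sketch under some assumptions: demo_private_bind is a made-up function name, the a and b directories live in a temporary directory instead of root's home, and the script needs root plus an unshare that understands --propagation. When it cannot create a mount namespace, it prints a message and stops.

```shell
#!/bin/sh
# Sketch: check that a bind mount made inside a private mount namespace
# is not visible from the original namespace.
demo_private_bind() {
    # Bail out politely when a mount namespace cannot be created here.
    if ! unshare --mount --propagation private true 2>/dev/null; then
        echo "skipped: cannot unshare (need root and unshare --propagation)"
        return 0
    fi

    work=$(mktemp -d) || return 1
    mkdir "$work/a" "$work/b"
    touch "$work/a/foo.txt"

    # Inside the new namespace: bind a onto b, then list b.
    inside=$(unshare --mount --propagation private -- \
        sh -c 'mount --bind "$1/a" "$1/b" && ls "$1/b"' sh "$work")

    # Back in the original namespace: list b again.
    outside=$(ls "$work/b")

    echo "inside:  ${inside:-<empty>}"
    echo "outside: ${outside:-<empty>}"

    # The namespace and its bind mount are gone once unshare has exited.
    rm -rf "$work"
}

demo_private_bind
```

When the namespace is really private, the inside listing should show foo.txt while the outside listing stays empty.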