linux – What is the NSFS Filesystem


The kernel contains a filesystem called nsfs. snapd creates an nsfs mount under /run/snapd/ns/<snapname>.mnt for each installed snap; ls shows it as a 0-byte file.

The kernel source code does not seem to contain any documentation or comments about it. The main implementation seems to be here and the header file here.

From that, it seems to be namespace related.

A search of the repo does not even find Kconfig entries to enable or disable it…

What is the purpose of this filesystem and what is it used for?

Best Answer

As described in the kernel commit log linked to by jiliagre above, nsfs is a virtual filesystem that exposes Linux kernel namespaces as files. It is separate from the proc filesystem mounted at /proc; the entries under /proc/$PID/ns/ merely reference inodes in nsfs to show which namespaces a given process (or thread) is currently using.
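To make the relationship between /proc and nsfs more concrete, here is a minimal shell sketch. The helper name ns_id is made up for this illustration; it only strips the type prefix and brackets from a link target such as net:[4026532992]. On Linux, the parsed number should match the inode number that stat -L reports for the same /proc entry.

```shell
# ns_id is a hypothetical helper, not part of any tool: it extracts the
# inode number from a namespace link target like "net:[4026532992]".
ns_id() {
  local target="$1"
  target=${target#*:\[}          # drop everything up to and including ":["
  printf '%s\n' "${target%\]}"   # drop the trailing "]"
}

# Only touch /proc where it actually offers namespace links (Linux).
if [ -e /proc/self/ns/net ]; then
  link=$(readlink /proc/self/ns/net)
  echo "link target: $link"
  echo "parsed id:   $(ns_id "$link")"
  # stat -L follows the link and prints the inode of the nsfs object
  echo "nsfs inode:  $(stat -L -c %i /proc/self/ns/net)"
fi
```

Two processes share a namespace exactly when these numbers are equal for the same namespace type.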

nsfs is not listed in /proc/filesystems (while proc is), so it cannot be mounted explicitly: mount -t nsfs ./namespaces fails with "unknown filesystem type". This is because nsfs is tightly interwoven with the proc filesystem.
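The claim about /proc/filesystems is easy to check. The following is a small sketch; fs_registered is a hypothetical helper name, and the optional second argument exists only so the function can be pointed at a different listing file.

```shell
# fs_registered is a hypothetical helper: it succeeds if the given type
# appears in a /proc/filesystems-style listing. The type is the last
# field on each line; virtual filesystems carry a leading "nodev" column.
fs_registered() {
  awk -v fs="$1" '$NF == fs { found = 1 } END { exit !found }' \
    "${2:-/proc/filesystems}"
}

for fs in proc nsfs; do
  if fs_registered "$fs" 2>/dev/null; then
    echo "$fs: listed"
  else
    echo "$fs: not listed"
  fi
done
```

On a typical Linux system this should report proc as listed and nsfs as not listed, in line with the mount failure described above.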

The filesystem type nsfs only becomes visible via /proc/$PID/mountinfo when bind-mounting an existing(!) namespace filesystem link to another target. As Stephen Kitt rightly suggests above, this is to keep namespaces existing even if no process is using them anymore.
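As a rough sketch of what "visible via mountinfo" means, the snippet below filters a mountinfo-style file for nsfs entries. The helper name nsfs_mounts is invented for this example, and the field positions are assumed from the documented mountinfo layout (field 4 is the root, field 5 the mount point, and the filesystem type follows the "-" separator).

```shell
# nsfs_mounts is a hypothetical helper: it prints "<mount point> <root>"
# for every nsfs entry in a mountinfo-style file.
nsfs_mounts() {
  awk '{
    # skip the optional fields: find the "-" separator (field 7 or later)
    for (i = 7; i <= NF; i++) if ($i == "-") break
    # the filesystem type is the field right after the separator
    if ($(i + 1) == "nsfs") print $5, $4
  }' "${1:-/proc/self/mountinfo}"
}

# List bind-mounted namespaces visible to this process, if any.
if [ -r /proc/self/mountinfo ]; then
  nsfs_mounts
fi
```

For a bind-mounted namespace, the root field carries the namespace identifier (for example net:[...]), which is the same string findmnt shows in square brackets after the source.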

For example, create a new network namespace, bind-mount it, then exit: the namespace still exists, but lsns won't find it, since it is no longer listed under any /proc/$PID/ns; it lives on only as a (bind) mount point.

# bind mount only needs an inode, not necessarily a directory ;)
touch mynetns
# create new network namespace, show its id and then bind-mount it, so it
# is kept existing after the unshare'd bash has terminated.
# output: net:[##########]
NS=$(sudo unshare -n bash -c "readlink /proc/self/ns/net && mount --bind /proc/self/ns/net mynetns") && echo "$NS"
# extract the inode number between "net:[" and "]"
# (a negative substring length needs bash 4.2 or newer)
ID=${NS:5:-1}
# notice how lsns cannot see this namespace anymore: no match!
lsns -t net | grep "$ID" || echo "lsns: no match for net:[$ID]"
# however, findmnt does locate it on the nsfs...
findmnt -t nsfs | grep "$ID" || echo "findmnt: no match for net:[$ID]"
# output: /home/.../mynetns nsfs[net:[##########]] nsfs rw
# let the namespace go...
echo "unbinding + releasing network namespace"
sudo umount mynetns
findmnt -t nsfs | grep "$ID" || echo "findmnt: no match for net:[$ID]"
# clean up
rm mynetns

Output should be similar to this one:

lsns: no match for net:[4026532992]
/home/.../mynetns nsfs[net:[4026532992]] nsfs   rw
unbinding + releasing network namespace
findmnt: no match for net:[4026532992]
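While such a bind mount is still in place (that is, before the umount step), the namespace it pins can be entered again. The sketch below assumes util-linux's nsenter with its --net=FILE option; run_in_netns is a hypothetical wrapper name, and the example call is commented out because it needs root and an existing bind mount.

```shell
# run_in_netns is a hypothetical wrapper: it runs a command inside the
# network namespace that a bind-mounted nsfs file (such as mynetns)
# keeps alive. nsenter --net=FILE joins the namespace behind FILE.
run_in_netns() {
  local nsfile="$1"
  shift
  sudo nsenter --net="$nsfile" -- "$@"
}

# Example call (needs root and an existing bind mount):
# run_in_netns ./mynetns ip link show
```

In a freshly created network namespace, such a call would typically list only the loopback interface.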

Please note that it is not possible to create namespaces via the nsfs filesystem, only via the syscalls clone() (with the CLONE_NEW... flags) and unshare(). nsfs only reflects the kernel's current state w.r.t. namespaces; it cannot create or destroy them.

Namespaces are destroyed automatically as soon as no reference to them is left: no process using them (so no /proc/$PID/ns/... entry) AND no bind mount either, as we've explored in the above example.
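The process side of that reference check can be sketched as follows. ns_pids is a hypothetical helper name, and its third argument (an alternative proc root) is an assumption added only to make the function easy to try against a test directory. Reading the ns links of other users' processes generally requires root.

```shell
# ns_pids is a hypothetical helper: it prints the PIDs whose
# <proc>/<pid>/ns/<type> link points at the namespace with the given id.
ns_pids() {
  local type="$1" id="$2" proc="${3:-/proc}" link pid
  for link in "$proc"/[0-9]*/ns/"$type"; do
    if [ "$(readlink "$link" 2>/dev/null)" = "$type:[$id]" ]; then
      pid=${link#"$proc"/}   # strip the leading "<proc>/"
      echo "${pid%%/*}"      # keep only the PID component
    fi
  done
}

# Example call, with the id taken from the sample output above:
# sudo bash -c "$(declare -f ns_pids); ns_pids net 4026532992"
```

If this prints nothing and findmnt -t nsfs shows no matching bind mount either, neither of the two kinds of reference discussed here is holding the namespace.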
