Network Namespaces – How to Connect Veth Device Inside and Outside


I have a process that has called unshare to create a new network namespace with just itself inside. When it calls execve to launch bash, the ip command shows that I have just an lo device. If I also create a user namespace and arrange for my process to be root inside the namespace, I can use the ip command to bring that device up and it works.

I can also use the ip command to create a veth device in this namespace. But it doesn't show up in ip netns list and the new veth device doesn't show up in the root level namespace (as I'd expect). How do I connect a veth device in the root-level namespace to my new veth device inside my process namespace? The ip command seems to require that the namespace has a name assigned by the ip command, and mine doesn't because I didn't use ip netns add to create it.

Maybe I could do it by writing my own program that used the netlink device and set things up. But I'd really prefer not to. Is there a way to do this through the command line?

There must be a way to do it, because docker containers have their own network namespace as well, and that namespace is also unnamed. Yet there is a veth device inside it that's connected to a veth device outside it.

My goal is to dynamically create a process isolation context, ideally without needing to become root outside the container. To this end I'm going to be creating a PID namespace, a UID namespace, a network namespace, an IPC namespace, and a mount namespace. I may also create a cgroup namespace, but those are newish and I need to be able to run on currently supported versions of SLES, RHEL, and Ubuntu LTS.

I've been working through this one namespace at a time, and I currently have User, PID and mount namespaces working satisfactorily.

I can mount /proc/pid/ns/net if I must, but I would prefer to do that from inside the user namespace so (again) I don't have to be root outside the namespace. Mostly, I want everything to disappear as soon as all the processes in the namespace are gone. Having a bunch of state to clean up on the filesystem when I'm done would be less than ideal. Though creating it temporarily when the container is first allocated and then immediately removing it is far better than having to clean it up when the container exits.

No, I can't use docker, lxc, rkt, or any other existing solution such that I'd be relying on anything other than bog-standard system utilities (like ip), system libraries like glibc, and Linux system calls.

Best Answer

ip link has a netns option which, in addition to a network namespace name, accepts a PID to refer to that process's network namespace. If the PID namespace is shared between the processes, you can move devices in either direction; it is probably easiest from inside, treating PID 1 as "outside". With separate PID namespaces, you need to move the device from the outer (PID) namespace into the inner one.

For example, from inside a network namespace you can create a veth pair whose peer end is placed in the network namespace of PID 1:

ip link add veth0 type veth peer name veth0 netns 1
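The same option can be used from the outer namespace by passing the PID of the isolated process instead of 1. The following is only a minimal sketch under stated assumptions: the function name veth_plan, the interface names veth-host and veth-ns, and the 10.200.0.0/24 addresses are all made up for illustration. It prints the commands rather than running them, because running them needs CAP_NET_ADMIN in the outer namespace (typically root).

```shell
#!/bin/sh
# veth_plan PID: print the commands that would link the current (outer) network
# namespace to the network namespace of process PID.
# Interface names and addresses are illustrative assumptions, not from the answer.
veth_plan() {
    pid=$1
    cat <<EOF
ip link add veth-host type veth peer name veth-ns netns $pid
ip addr add 10.200.0.1/24 dev veth-host
ip link set veth-host up
nsenter --target $pid --net ip addr add 10.200.0.2/24 dev veth-ns
nsenter --target $pid --net ip link set veth-ns up
EOF
}
```

Calling veth_plan 4242 prints the five commands for PID 4242 so they can be reviewed; piping that output to sh as root would execute them.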

How namespaces work in Linux

Every process has reference files for its namespaces in /proc/<pid>/ns/. Additionally, ip netns creates persistent reference files in /run/netns/. These files are passed to the setns system call to switch the calling thread into the namespace that the file points to.

From the shell you can enter another namespace with the nsenter program, passing the namespace files (paths) as arguments.
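Each of these reference files is a symbolic link whose target names the namespace type and an inode number, for example net:[4026531992] (the number varies per system). Two processes share a network namespace when their links read the same. A minimal sketch of that check follows; the function name same_netns is hypothetical, and reading the entries of a process you do not own may fail without extra privileges.

```shell
#!/bin/sh
# same_netns PID1 PID2: succeed (exit 0) if both processes share a network namespace.
# Each /proc/<pid>/ns/net link reads as "net:[<inode>]"; equal text means same namespace.
# Returns 2 if either link cannot be read (no such process, or no permission).
same_netns() {
    a=$(readlink "/proc/$1/ns/net") || return 2
    b=$(readlink "/proc/$2/ns/net") || return 2
    [ "$a" = "$b" ]
}
```

For instance, same_netns $$ 1 && echo shared reports whether the current shell is still in the network namespace of PID 1, provided /proc/1/ns/net is readable.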
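As a minimal sketch, the helper below maps either a PID or a name created with ip netns add to the matching reference file, which can then be handed to nsenter. The function name netns_file is an assumption for illustration; the two path patterns are the ones described above.

```shell
#!/bin/sh
# netns_file REF: print the network namespace reference file for REF.
# A purely numeric REF is treated as a PID, anything else as an "ip netns" name.
netns_file() {
    case $1 in
        '')       return 1 ;;                              # nothing to resolve
        *[!0-9]*) printf '/run/netns/%s\n' "$1" ;;         # named namespace
        *)        printf '/proc/%s/ns/net\n' "$1" ;;       # namespace of a PID
    esac
}

# Usage (needs privileges over the target namespace, typically root):
#   nsenter --net="$(netns_file 4242)" ip link show
```

Keeping the path lookup separate from the nsenter call makes it easy to check which file will be used before entering the namespace.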

A good overview of Linux namespaces is given in the Namespaces in operation article series on LWN.net.

Setting up namespaces

When you set up multiple namespaces (mount, PID, user, etc.), set up the network namespace as early as possible, before altering the mount and PID namespaces. Without a shared mount or PID namespace you have no way to refer to the outer network namespace, because you cannot see the files that reference it.

If you need more flexibility than the command-line utilities provide, you need to use the system calls to manage namespaces directly from your program. For documentation, see the relevant man pages: man 2 setns, man 2 unshare and man 7 namespaces.
