Network Namespaces – How to Connect Veth Device Inside and Outside

network-namespaces networking

I have a process that has called unshare to create a new network namespace with just itself inside. When it calls execve to launch bash, the ip command shows that I have just an lo device. If I also create a user namespace and arrange for my process to be root inside the namespace, I can use the ip command to bring that device up and it works.

I can also use the ip command to create a veth device in this namespace. But it doesn't show up in ip netns list and the new veth device doesn't show up in the root level namespace (as I'd expect). How do I connect a veth device in the root-level namespace to my new veth device inside my process namespace? The ip command seems to require that the namespace has a name assigned by the ip command, and mine doesn't because I didn't use ip netns add to create it.

Maybe I could do it by writing my own program that used the netlink device and set things up. But I'd really prefer not to. Is there a way to do this through the command line?

There must be a way to do it, because docker containers have their own network namespace as well, and that namespace is also unnamed. Yet there is a veth device inside it that's connected to a veth device outside it.

My goal is to dynamically create a process isolation context, ideally without needing to become root outside the container. To this end I'm going to be creating a PID namespace, a UID namespace, a network namespace, an IPC namespace, and a mount namespace. I may also create a cgroup namespace, but those are newish and I need to be able to run on currently supported versions of SLES, RHEL, and Ubuntu LTS.

I've been working through this one namespace at a time, and I currently have User, PID and mount namespaces working satisfactorily.

I can mount /proc/pid/ns/net if I must, but I would prefer to do that from inside the user namespace so (again) I don't have to be root outside the namespace. Mostly, I want everything to disappear as soon as all the processes in the namespace are gone. Having a bunch of state to clean up on the filesystem when I'm done would be less than ideal. Though creating it temporarily when the container is first allocated and then immediately removing it is far better than having to clean it up when the container exits.

No, I can't use docker, lxc, rkt, or any other existing solution such that I'd be relying on anything other than bog-standard system utilities (like ip), system libraries like glibc, and Linux system calls.

Best Answer

ip link has a netns option which, in addition to a network namespace name, accepts a PID to refer to that process's namespace. If the PID namespace is shared between the processes, you can move devices either way; it is probably easiest from inside, treating PID 1 as "outside". With separate PID namespaces, you need to move the device from the outer (PID) namespace to the inner one.

For example, from inside a network namespace you can create a veth pair whose peer end lives in the namespace of PID 1:

ip link add veth0 type veth peer name veth0 netns 1
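
Going the other way (from the outer namespace toward a process in its own network namespace) can be sketched as below. This is only an illustration under assumptions: the interface names veth-host and veth-ns and the 10.200.0.0/24 addresses are hypothetical, the target PID must be visible from where the commands run, and the caller generally needs CAP_NET_ADMIN (typically root) in the outer namespace.

```shell
#!/bin/sh
# Sketch: connect the current (outer) network namespace to the network
# namespace of a target process, addressed by PID.
# Interface names and addresses are hypothetical placeholders.

connect_veth() {
    target_pid=$1

    # Create the pair; the peer end is created directly in the target's namespace.
    ip link add veth-host type veth peer name veth-ns netns "$target_pid" &&
    # Configure and enable the outer end.
    ip addr add 10.200.0.1/24 dev veth-host &&
    ip link set veth-host up &&
    # Configure and enable the inner end by entering the target's namespace.
    nsenter -t "$target_pid" -n ip addr add 10.200.0.2/24 dev veth-ns &&
    nsenter -t "$target_pid" -n ip link set veth-ns up
}
```

It would be called as connect_veth 12345, where 12345 stands for the PID of the process that called unshare. Because the veth pair is destroyed together with the namespace that holds one of its ends, nothing should be left behind on the filesystem once that process exits.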

How namespaces work in Linux

Every process has reference files for its namespaces in /proc/<pid>/ns/. Additionally, ip netns creates persistent reference files in /run/netns/. These files are used with the setns system call to switch the calling thread into the namespace that the file refers to.

From the shell, you can enter another namespace using the nsenter program, passing the namespace files (paths) as arguments.
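
As a small sketch of that usage, the helper below runs a command inside the network namespace of a given process by pointing nsenter at its /proc/<pid>/ns/net file. The function name in_netns is made up for this example.

```shell
#!/bin/sh
# Sketch: run a command inside the network namespace of process <pid>.
# in_netns is a hypothetical helper name.

in_netns() {
    pid=$1
    shift
    # --net=<file> enters the network namespace referenced by that file.
    nsenter --net="/proc/$pid/ns/net" "$@"
}
```

For example, in_netns 12345 ip addr show would list the interfaces that process 12345 sees, provided the caller has enough privileges to enter that namespace.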

A good overview of Linux namespaces is given in the Namespaces in operation article series on LWN.net.

Setting up namespaces

When you set up multiple namespaces (mount, pid, user, etc.), set up the network namespace as early as possible, before altering the mount and pid namespaces. If you do not share the mount or pid namespace with the outside, you have no way to point to the outer network namespace, because you cannot see the files that refer to it.

If you need more flexibility than the command-line utilities provide, you need to use the system calls to manage namespaces directly from your program. For documentation, see the relevant man pages: man 2 setns, man 2 unshare and man 7 namespaces.

Related Solutions

Linux namespace, How to connect internet in network namespace

Think of a network namespace as another computer. Think of a veth pair as two Ethernet cards with a crossover cable between them.

There are three main ways to connect a network namespace to the Internet: NAT, conventional IP routing, and Ethernet bridging.

NAT is generally the easiest to set up because it works with any type of upstream internet connection and doesn't require the cooperation of the upstream network.

For NAT to work several things need to be in place.

The default gateway needs to be set up in the secondary network namespace (you appear to have done this).
IP forwarding needs to be enabled in the main network namespace (you have not shown how this setting is set).
The iptables rules in the main network namespace need to allow the traffic to pass (on the kernel side this is OK by default, but some firewall software may have set up rules that block forwarding).
An appropriate SNAT or MASQUERADE rule needs to be in place in the main network namespace (you appear to have done this).
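
The four requirements above could be put together roughly as follows. This is a sketch under assumptions, not the asker's configuration: the subnet, uplink interface, gateway address, and PID passed in are placeholders, and a host with its own firewall manager may need its rules adjusted through that tool instead.

```shell
#!/bin/sh
# Sketch: NAT for a secondary network namespace.
# All argument values used with these functions are hypothetical.

# Run in the main network namespace.
setup_nat() {
    subnet=$1   # e.g. 10.200.0.0/24, the veth subnet
    uplink=$2   # e.g. eth0, the interface facing the Internet

    # Enable IPv4 forwarding.
    sysctl -w net.ipv4.ip_forward=1 &&
    # Allow forwarded traffic out, and replies back in.
    iptables -A FORWARD -s "$subnet" -o "$uplink" -j ACCEPT &&
    iptables -A FORWARD -d "$subnet" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT &&
    # Rewrite the source address of outgoing packets.
    iptables -t nat -A POSTROUTING -s "$subnet" -o "$uplink" -j MASQUERADE
}

# Set the default gateway inside the namespace of process <pid>.
set_default_route() {
    pid=$1
    gateway=$2  # address of the veth end in the main namespace
    nsenter -t "$pid" -n ip route add default via "$gateway"
}
```

A call such as setup_nat 10.200.0.0/24 eth0 followed by set_default_route 12345 10.200.0.1 would cover the list, assuming the veth pair is already up with matching addresses.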

You also need to ensure that an appropriate /etc/resolv.conf is available for programs in the secondary network namespace. Remember that even if you bring up the local loopback interface in the secondary network namespace (which you should do), loopback is still local to each network namespace, so a resolver listening on 127.0.0.1 in the main namespace is not reachable from the secondary one.

It is best to ping/traceroute by IP address when initially setting up networks, to separate name resolution issues from general connectivity issues.
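
One way to keep the two kinds of failure apart is to test them in order, as in this sketch. The function name diagnose is made up, and the address and hostname you pass in are your own choice; it is meant to be run inside the secondary namespace.

```shell
#!/bin/sh
# Sketch: check raw IP connectivity first, then name resolution.
# diagnose is a hypothetical helper name.

diagnose() {
    ip_addr=$1
    host=$2

    # Step 1: one ping by IP address, waiting at most 2 seconds.
    if ! ping -c 1 -W 2 "$ip_addr" >/dev/null 2>&1; then
        echo "no IP connectivity to $ip_addr"
        return 1
    fi

    # Step 2: resolve a name through the system resolver.
    if ! getent hosts "$host" >/dev/null 2>&1; then
        echo "IP connectivity OK, but name resolution failed for $host"
        return 2
    fi

    echo "connectivity and name resolution OK"
}
```

If the first message appears, look at routing, forwarding, and NAT; if the second appears, look at /etc/resolv.conf inside the namespace.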

D-Bus Network – Connect with D-Bus in a Network Namespace

Connecting to a D-Bus daemon listening on an abstract Unix socket in a different network namespace is not possible. Such sockets can be identified in the output of ss -x by an address that starts with @:

u_str  ESTAB      0      0      @/tmp/dbus-t00hzZWBDm 11204746              * 11210618

As a workaround, you can create a non-abstract Unix or IP socket which proxies to the abstract Unix socket. This is to be done outside the network namespace. From within the network namespace, you can then connect to that address. E.g. assuming the above abstract socket address, run this outside the namespace:

socat UNIX-LISTEN:/tmp/whatever,fork ABSTRACT-CONNECT:/tmp/dbus-t00hzZWBDm

Then from within the namespace you can connect by setting this environment variable:

DBUS_SESSION_BUS_ADDRESS=unix:path=/tmp/whatever