Task
I need to unambiguously and without "holistic" guessing find the peer network interface of a veth end in another network namespace.
Theory vs. Reality
Although a lot of documentation, and also answers here on SO, assume that the ifindex indices of network interfaces are globally unique per host across network namespaces, this doesn't hold in many cases: ifindex/iflink are ambiguous. Even the loopback interface already shows the contrary, having an ifindex of 1 in every network namespace. Also, depending on the container environment, ifindex numbers get reused in different namespaces. This makes tracing veth wiring a nightmare, especially with lots of containers and a host bridge with veth peers all ending in @if3 or so…
Example: link-netnsid is 0
Spin up a Docker container instance, just to get a new veth pair connecting from the host network namespace to the new container network namespace…
$ sudo docker run -it debian /bin/bash
Now, in the host network namespace list the network interfaces (I've left out those interfaces that are of no interest to this question):
$ ip link show
1: lo: mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
...
4: docker0: mtu 1500 qdisc noqueue state UP mode DEFAULT group default
    link/ether 02:42:34:23:81:f0 brd ff:ff:ff:ff:ff:ff
...
16: vethfc8d91e@if15: mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
    link/ether da:4c:f7:50:09:e2 brd ff:ff:ff:ff:ff:ff link-netnsid 0
As you can see, while the iflink is unambiguous, the link-netnsid is 0, despite the peer end sitting in a different network namespace.
For reference, check the netnsid in the unnamed network namespace of the container:
$ sudo lsns -t net
        NS TYPE NPROCS   PID USER COMMAND
... ...
4026532469 net       1 29616 root /bin/bash
$ sudo nsenter -t 29616 -n ip link show
1: lo: mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
15: eth0@if16: mtu 1500 qdisc noqueue state UP mode DEFAULT group default
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
So, for both veth ends, ip link show (and RTNETLINK, FWIW) tells us they're in the same network namespace with netnsid 0. This is either wrong, or correct under the assumption that link-netnsids are local as opposed to global. I could not find any documentation that makes explicit what scope link-netnsids are supposed to have.
/sys/class/net/... NOT to the Rescue?
I've looked into /sys/class/net/if/… but can only find the ifindex and iflink elements; these are well documented. "ip link show" also only seems to show the peer ifindex in form of the (in)famous "@if#" notation. Or did I miss some additional network namespace element?
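For reference, reading the two sysfs elements mentioned above takes only a few lines. A minimal sketch in Python: the function name read_link_indices and its sysfs_root parameter are assumptions made for this illustration (the parameter only exists so the function can be pointed at a different directory).

```python
from pathlib import Path


def read_link_indices(ifname, sysfs_root="/sys/class/net"):
    """Return (ifindex, iflink) for ifname.

    ifindex is the interface's own index; iflink is the index of the
    interface it is linked to (for a veth end: the peer's index, valid
    in the peer's namespace). Neither value says which namespace.
    """
    base = Path(sysfs_root) / ifname
    ifindex = int((base / "ifindex").read_text())
    iflink = int((base / "iflink").read_text())
    return ifindex, iflink
```

As far as I can tell, /sys/class/net lists the interfaces of the network namespace that sysfs was mounted from, so entering a namespace with nsenter -n alone is probably not enough to see that namespace's interfaces there.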
Bottom Line/Question
Are there any syscalls that allow retrieving the missing network namespace information for the peer end of a veth pair?
Best Answer
Here's the method I followed to understand this problem. The available tools appear usable (with some convolution) for the namespace part, and (UPDATED) using /sys/ can easily get the peer's index. So it's quite long, bear with me. It's in two parts (which are not in the logical order, but covering the namespace first helps explain the index naming), using common tools, not any custom program:
Network namespace
This information is available with the property link-netnsid in the output of ip link and can be matched with the id in the output of ip netns. It's possible to "associate" a container's network namespace with ip netns, thus using ip netns as a specialized tool. Of course writing a specific program for this would be better (some information about syscalls at the end of each part).
About the nsid's description, here's what man ip netns tells (emphasis mine):
While creating a namespace with ip netns won't immediately create a netnsid, it will be created (on the current namespace, probably the "host") whenever a veth half is set to another namespace. So it's always set for a typical container.
Here's an example using an LXC container:
A new veth link veth9RPX4M appeared (this can be tracked with ip monitor link). Here is the detailed information:
This link has the property link-netnsid 4, telling the other side is in the network namespace with nsid 4. How to verify it's the LXC container? The easiest way to get this information is making ip netns believe it created the container's network namespace, by doing the operations hinted in the manpage.
UPDATE3: I didn't understand that finding back the global name was a problem. Here it is:
Now the information is retrieved with:
It confirms the veth's peer is in the network namespace with the same nsid = 4 = link-netnsid.
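The matching step (link-netnsid from ip link against the id shown by ip netns) can be sketched in Python as follows. This is only an illustration: the function names are made up, and it assumes ip netns list prints one namespace per line in the form "NAME (id: N)", with the id part missing when no nsid has been assigned.

```python
import re

# One line of `ip netns list`: "NAME" or "NAME (id: N)"
ENTRY = re.compile(r"^(\S+)(?:\s+\(id:\s*(\d+)\))?$")


def parse_netns_list(text):
    """Map nsid -> namespace name from `ip netns list` output."""
    by_nsid = {}
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match and match.group(2) is not None:
            by_nsid[int(match.group(2))] = match.group(1)
    return by_nsid


def peer_namespace(link_netnsid, netns_list_output):
    """Return the namespace name for a link-netnsid, or None if unknown.

    Both inputs must come from the same network namespace, because
    nsids are only valid in the namespace that assigned them.
    """
    return parse_netns_list(netns_list_output).get(link_netnsid)
```

For example, with a (hypothetical) listing containing the line "mycontainer (id: 4)", peer_namespace(4, listing) returns "mycontainer".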
The container/ip netns "association" can be removed (without removing the namespace as long as the container is running):
Note: the nsid naming is per network namespace, usually starts with 0 for the first container, and the lowest value available is recycled with new namespaces.
About using syscalls, here is some information guessed from strace:
for the link part: it requires an AF_NETLINK socket (opened with socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE)), asking (sendmsg()) for the link's information with a message of type RTM_GETLINK and retrieving (recvmsg()) the reply with message type RTM_NEWLINK.
for the netns nsid part: same method, the query message is of type RTM_GETNSID with reply type RTM_NEWNSID.
I think the slightly higher level libraries to handle this are there: libnl. Anyway it's a topic for SO.
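The RTM_GETLINK exchange described above can be sketched without libnl, using only Python's standard library. Treat this as a rough illustration, not a robust client: the numeric constants are copied from the Linux uapi headers, the function names parse_link_payload and dump_links are made up here, only three attributes are decoded, and netlink error messages are not interpreted.

```python
import socket
import struct

# Constants from <linux/netlink.h>, <linux/rtnetlink.h>, <linux/if_link.h>
RTM_NEWLINK, RTM_GETLINK = 16, 18
NLM_F_REQUEST, NLM_F_DUMP = 0x001, 0x300
NLMSG_ERROR, NLMSG_DONE = 2, 3
IFLA_IFNAME, IFLA_LINK, IFLA_LINK_NETNSID = 3, 5, 37

NLMSGHDR = struct.Struct("=IHHII")    # len, type, flags, seq, pid
IFINFOMSG = struct.Struct("=BxHiII")  # family, pad, type, index, flags, change
RTATTR = struct.Struct("=HH")         # len, type


def parse_link_payload(payload):
    """Decode one RTM_NEWLINK payload (ifinfomsg followed by attributes)."""
    _family, _type, index, _flags, _change = IFINFOMSG.unpack_from(payload, 0)
    link = {"ifindex": index, "name": None, "iflink": None, "link_netnsid": None}
    offset = IFINFOMSG.size
    while offset + RTATTR.size <= len(payload):
        rta_len, rta_type = RTATTR.unpack_from(payload, offset)
        if rta_len < RTATTR.size:
            break  # malformed attribute: stop instead of looping forever
        data = payload[offset + RTATTR.size:offset + rta_len]
        if rta_type == IFLA_IFNAME:
            link["name"] = data.rstrip(b"\0").decode()
        elif rta_type == IFLA_LINK:
            link["iflink"] = struct.unpack("=I", data[:4])[0]
        elif rta_type == IFLA_LINK_NETNSID:
            link["link_netnsid"] = struct.unpack("=i", data[:4])[0]
        offset += (rta_len + 3) & ~3  # attributes are padded to 4 bytes
    return link


def dump_links():
    """Request all links of the caller's network namespace (Linux only)."""
    with socket.socket(socket.AF_NETLINK, socket.SOCK_RAW,
                       socket.NETLINK_ROUTE) as sock:
        sock.bind((0, 0))
        body = IFINFOMSG.pack(socket.AF_UNSPEC, 0, 0, 0, 0)
        header = NLMSGHDR.pack(NLMSGHDR.size + len(body), RTM_GETLINK,
                               NLM_F_REQUEST | NLM_F_DUMP, 1, 0)
        sock.send(header + body)
        links = []
        while True:
            buf = sock.recv(65536)
            offset = 0
            while offset + NLMSGHDR.size <= len(buf):
                msg_len, msg_type, _fl, _seq, _pid = NLMSGHDR.unpack_from(buf, offset)
                if msg_len < NLMSGHDR.size:
                    raise OSError("truncated netlink message")
                if msg_type == NLMSG_DONE:
                    return links
                if msg_type == NLMSG_ERROR:
                    raise OSError("netlink returned an error message")
                if msg_type == RTM_NEWLINK:
                    links.append(parse_link_payload(
                        buf[offset + NLMSGHDR.size:offset + msg_len]))
                offset += (msg_len + 3) & ~3
```

Calling dump_links() on a Linux host should return one dict per interface of the caller's namespace. The link_netnsid value is expected to stay None for interfaces whose link is in the same namespace, and like the ip link output it is only valid relative to the namespace the socket was opened in.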
Interface index
Now it will be easier to follow why the indexes appear to behave randomly. Let's do an experiment:
First enter a new net namespace to have a clean (index) slate:
As OP noted, lo begins with index 1.
Let's add 5 net namespaces, create veth pairs, then put a veth end on them:
When it's displaying @if2 for each of them, it becomes quite clear that it's the interface index in the peer's namespace, and that indexes are not global but per namespace. When it's displaying an actual interface name, it's a relation to an interface in the same namespace (be it the veth's peer, a bridge, a bond ...). So why doesn't veth0 have a peer displayed? I believe it's an ip link bug when the peer's index is the same as its own. Just moving the peer link twice "solves" it here, because it forced an index change. I'm also sure ip link sometimes gets confused in other ways and, instead of displaying @ifXX, displays an interface in the current namespace with the same index.
UPDATE: reading again the information in OP's question, the peer's index (but not nsid) is easily and unambiguously available with cat /sys/class/net/interface/iflink.
UPDATE2:
All those iflink 2 may appear ambiguous, but what is unique is the combination of nsid and iflink, not iflink alone. For the above example that is:
In this namespace (namely namespace test) there will never be two identical nsid:iflink pairs.
If one was to look from each peer network namespace, the opposite information is:
But bear in mind that for all the 0: entries there, each one is a separate 0 that happens to map to the same peer namespace (namely: namespace test, not even the host). They can't be directly compared because they're tied to their namespace. So the whole comparable and unique information should be:
Once it's confirmed that "test0:0" == "test1:0" etc. (true in this example, all map to the net namespace called test by ip netns) then they can really be compared.
About syscalls, still looking at strace results, the information is retrieved as above from RTM_GETLINK. Now all the information should be available:
local: interface index with SIOCGIFINDEX / if_nametoindex
peer: both nsid and interface index with RTM_GETLINK.
All this should probably be used with libnl.
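To make the "comparable and unique information" idea concrete, here is a small Python sketch. Everything in it is hypothetical: the function names, the nsid_maps and interfaces structures, and the sample values in the usage note are made up for illustration. It assumes the nsid of each observing namespace has already been resolved to a namespace name (for instance with ip netns as shown earlier).

```python
def global_peer_key(observer_ns, link_netnsid, iflink, nsid_maps):
    """Turn a namespace-local (nsid, iflink) pair into a comparable key.

    nsid_maps[observer_ns][nsid] names the namespace that nsid refers to
    when seen from observer_ns. The nsid alone is not comparable across
    namespaces; the resolved namespace name is.
    """
    peer_ns = nsid_maps[observer_ns][link_netnsid]
    return (peer_ns, iflink)


def find_peer(observer_ns, link_netnsid, iflink, nsid_maps, interfaces):
    """Look up the peer's name in a table keyed by (namespace, ifindex).

    Returns None when the peer is not in the table.
    """
    key = global_peer_key(observer_ns, link_netnsid, iflink, nsid_maps)
    return interfaces.get(key)
```

With nsid_maps = {"test": {0: "test0", 1: "test1"}, "test0": {0: "test"}, "test1": {0: "test"}}, the pairs 0:2 and 1:2 seen from test resolve to ("test0", 2) and ("test1", 2), so identical iflink values no longer collide, while the separate 0 of test0 and the 0 of test1 both resolve to test.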