Linux – Building unprivileged (userns) LXC container from scratch, by migrating a privileged container to be unprivileged

How can I build a privileged LXC (1.0.3) container (that part I know) and then migrate it successfully to be run unprivileged? That is, I'd like to debootstrap it myself or adjust the lxc-ubuntu template (commonly under /usr/share/lxc/templates) in order for this to work.

Here's why I am asking this question. If you look at the lxc-ubuntu template, you'll notice:

# Detect use under userns (unsupported)
for arg in "$@"; do
    [ "$arg" = "--" ] && break
    if [ "$arg" = "--mapped-uid" -o "$arg" = "--mapped-gid" ]; then
        echo "This template can't be used for unprivileged containers." 1>&2
        echo "You may want to try the \"download\" template instead." 1>&2
        exit 1
    fi
done

Following the use of LXC_MAPPED_GID and LXC_MAPPED_UID in the referenced lxc-download template, though, there seems to be nothing particularly special. In fact all it does is to adjust the file ownership (chgrp + chown). But it's possible that the extended attributes in the download template are fine-tuned already to accomplish whatever "magic" is needed.

In the comments on this blog post, Stéphane Graber tells a commenter:

There’s no easy way to do that unfortunately, you’d need to update
your container config to match that from an unprivileged container,
move the container’s directory over to the unprivileged user you want
it to run as, then use Serge’s uidshift program to change the
ownership of all files.

… and to:

But there are no further pointers.

So my question is: how can I take an ordinary (privileged) LXC container that I have built myself (having root and all) and migrate it to become an unprivileged container? Even if you can't provide a script or so, it would be great to know which points to consider and how they affect the ability to run the unprivileged LXC container. I can come up with a script on my own and pledge to post it as an answer to this question if a solution can be found 🙂

Note: Although I am using Ubuntu 14.04, this is a generic question.

Best Answer

I was just doing something very similar, moving KVM VMs into unprivileged LXC.

I was using system containers for this (so they can be started automatically on boot), but with mapped UID/GIDs (user namespaces).

  1. Edit /etc/subuid and /etc/subgid (I mapped uids/gids 10M-100M to root and use 100K per container).
  2. For the first container, map uids/gids 10000000-10099999 in /var/lib/lxc/CTNAME/config.
  3. Mount the container storage on /var/lib/lxc/CTNAME/rootfs (or do nothing if you don't use a separate volume/dataset/whatever per container).
  4. chown 10000000:10000000 /var/lib/lxc/CTNAME/rootfs
  5. setfacl -m u:10000000:x /var/lib/lxc (or simply chmod o+x /var/lib/lxc)
  6. lxc-usernsexec -m b:0:10000000:100000 -- /bin/bash
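The range arithmetic in steps 1 and 2 can be sketched as follows. This is only an illustration: the helper name `idmap_lines` is hypothetical, the base and per-container range are taken from the numbers above, and the `lxc.id_map` key is the LXC 1.x spelling. The `/etc/subuid` line in the comment is my reading of "10M-100M to root", not something the answer spells out.

```shell
#!/bin/sh
# Assumed layout (from the answer): ids start at 10000000, 100000 per container.
# A matching /etc/subuid and /etc/subgid entry would look like (assumption):
#   root:10000000:90000000
BASE=10000000
RANGE=100000

# idmap_lines N: print the config lines for the Nth container (0-based).
# Hypothetical helper, not part of LXC.
idmap_lines() {
    n=$1
    start=$((BASE + n * RANGE))
    printf 'lxc.id_map = u 0 %d %d\n' "$start" "$RANGE"
    printf 'lxc.id_map = g 0 %d %d\n' "$start" "$RANGE"
}

# First container: container ids 0-99999 map to host ids 10000000-10099999.
idmap_lines 0
```

The two printed lines are what would go into /var/lib/lxc/CTNAME/config for the first container; `idmap_lines 1` gives the next, non-overlapping range.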

Now you're in the first container's user namespace. Everything looks the same, but your process thinks its uid is 0 when in the host namespace it is actually uid 10000000. Check /proc/self/uid_map to see whether your uid is mapped. You will notice that you can no longer read /root, which now appears to be owned by nobody/nogroup because the host's uid 0 is not mapped into the namespace.
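To make the nobody/nogroup effect concrete, here is a small sketch of how a host id appears from inside the namespace. The function `ns_view` is a hypothetical helper, it assumes the map starts at id 0 inside the namespace (as in the `b:0:10000000:100000` mapping above), and 65534 is the kernel's default overflow id.

```shell
#!/bin/sh
# ns_view HOST_START COUNT HOST_ID
# Print the id that HOST_ID appears as inside a namespace whose map is
# "0 HOST_START COUNT"; unmapped ids show up as the overflow id 65534.
ns_view() {
    host_start=$1; count=$2; host_id=$3
    if [ "$host_id" -ge "$host_start" ] && [ "$host_id" -lt $((host_start + count)) ]; then
        echo $((host_id - host_start))
    else
        echo 65534
    fi
}

# Show the map of the current process, if the kernel exposes it.
# Columns: first id inside the namespace, first id outside, length of the range.
if [ -r /proc/self/uid_map ]; then
    cat /proc/self/uid_map
fi

ns_view 10000000 100000 10000000   # prints 0 (the container's root)
ns_view 10000000 100000 0          # prints 65534 (host root is unmapped)
```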

While in the user namespace, I rsync from the original host.
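The rsync step might look roughly like the sketch below. The host name `root@origin.example` and the wrapper `rsync_cmd` are placeholders of mine, and the exact flags are an assumption rather than what was actually run: `--numeric-ids` keeps the numeric uids/gids of the origin instead of matching them by name, and /dev is excluded because device nodes cannot be created from inside an unprivileged user namespace.

```shell
#!/bin/sh
# Hypothetical names: adjust to your setup.
ORIGIN="root@origin.example"            # the machine being migrated
ROOTFS="/var/lib/lxc/CTNAME/rootfs"     # target rootfs from the steps above

# rsync_cmd: print the command instead of running it, so it can be reviewed.
rsync_cmd() {
    echo rsync -aH --numeric-ids \
        --exclude='/proc/*' --exclude='/sys/*' --exclude='/dev/*' \
        "$ORIGIN:/" "$ROOTFS/"
}

rsync_cmd
```

Run the printed command from the shell started by lxc-usernsexec, so that the files are written with the shifted ownership.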

Outside the user namespace, you will see that the files in /var/lib/lxc/CTNAME/rootfs are now owned not by the expected (same) uids as the origin installation, but rather 10000000+remote_uid. This is what you want.

That's it. Once your data is synced, remove everything from the container's /etc/fstab so it won't try to mount things, and it should start. There might be other things to change; check what the LXC template for the containerised distro does. You can definitely remove the kernel, grub, ntp and any hardware-probing packages in the container (you don't even have to run it; you can chroot into the container from the user namespace).

If you don't have a running remote VM, you can also mount the original VM storage in the host namespace and rsync over SSH back into localhost. The effect will be the same.

If you (as it seems) simply want to change your privileged container to unprivileged, you might as well just add the uid/gid ranges to /etc/subuid and /etc/subgid, add a mapping as above to your container config, and then do something along the lines of:

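# Shift every uid/gid in the rootfs by the mapping offset (slow: two find passes per id)
for i in $(seq 0 65535); do
  # -h changes symlinks themselves instead of following them to their targets
  find /var/lib/lxc/CTNAME/rootfs -uid "$i" -exec chown -h "$((10000000+i))" {} +
  find /var/lib/lxc/CTNAME/rootfs -gid "$i" -exec chgrp -h "$((10000000+i))" {} +
done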

That should be all that needs doing; you should now be able to run the container unprivileged. The example above is extremely inefficient; uidshift will probably do a better job (but I haven't used it yet).
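For comparison, a single-pass variant could look like the sketch below. It is not uidshift and not tested on a real container; `shift_ids` is a hypothetical helper that assumes GNU `stat` and `find`. It also restores the file mode after each chown, since on Linux changing ownership can clear setuid/setgid bits (think of /bin/su inside the rootfs).

```shell
#!/bin/sh
# shift_ids ROOTFS OFFSET
# Add OFFSET to the owner and group of every file under ROOTFS in one walk.
# Run it only once: a second run would shift the ids again.
shift_ids() {
    find "$1" -exec sh -c '
        offset=$1; shift
        for f in "$@"; do
            uid=$(stat -c %u "$f")
            gid=$(stat -c %g "$f")
            mode=$(stat -c %a "$f")
            # -h: change a symlink itself, not the file it points to
            chown -h "$((uid + offset)):$((gid + offset))" "$f"
            # chown may drop setuid/setgid bits; put the mode back.
            # Symlinks are skipped because chmod would follow them.
            [ -L "$f" ] || chmod "$mode" "$f"
        done
    ' sh "$2" {} +
}

# Example (as root on the host, with the container stopped):
#   shift_ids /var/lib/lxc/CTNAME/rootfs 10000000
```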

HTH.
