The technical explanation of what an unprivileged container is, is quite good. However, it is not aimed at the ordinary PC user. Is there a simple answer to when and why people should use unprivileged containers, and what are their benefits and downsides?
What are the benefits and downsides of unprivileged containers?
Tags: lxc, not-root-user, privileges, security, virtual-machine
Related Solutions
I was just doing something very similar, moving KVM VMs into unprivileged LXC.
I was using system containers for this (so they can be started automatically on boot), but with mapped UID/GIDs (user namespaces).
- edit /etc/subuid,subgid (I mapped uid/gids 10M-100M to root and use 100K per container)
- for first container, use u/gids 10000000-10099999 in /var/lib/lxc/CTNAME/config
- mount the container storage on /var/lib/lxc/CTNAME/rootfs (or do nothing if you don't use separate volume/dataset/whatever per container)
- chown 10000000:10000000 /var/lib/lxc/CTNAME/rootfs
- setfacl -m u:10000000:x /var/lib/lxc (or simply chmod o+x /var/lib/lxc)
- lxc-usernsexec -m b:0:10000000:100000 -- /bin/bash
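For reference, the first two steps might look like this on disk. This is only a sketch assuming the 10M base and 100K-per-container scheme above, with CTNAME as a placeholder; the config key is lxc.idmap on LXC 3.x and later, lxc.id_map on older releases:

```ini
# /etc/subuid and /etc/subgid -- delegate uids/gids 10M-100M to root:
root:10000000:90000000

# /var/lib/lxc/CTNAME/config -- first container gets 10000000-10099999:
lxc.idmap = u 0 10000000 100000
lxc.idmap = g 0 10000000 100000
```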
Now you're in the first container's user namespace. Everything looks the same, but your process thinks its uid is 0, when in fact in the host namespace its uid is 10000000. Check /proc/self/uid_map to see whether your uid is mapped or not. You will notice you can no longer read from /root, and it seems to be owned by nobody/nogroup.
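The translation that /proc/self/uid_map describes is simple arithmetic. A minimal sketch, assuming a single map line of the form "ns_start host_start count" (ns_to_host is a helper name invented for this example):

```shell
# Translate a uid as seen inside the namespace to the host uid,
# given the three fields of one uid_map line.
ns_to_host() {
  local uid="$1" ns_start="$2" host_start="$3" count="$4"
  if [ "$uid" -ge "$ns_start" ] && [ "$uid" -lt $((ns_start + count)) ]; then
    echo $((host_start + uid - ns_start))
  else
    echo "unmapped" >&2
    return 1
  fi
}

ns_to_host 0 0 10000000 100000   # container root -> host uid 10000000
```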
While in the user namespace, I rsync from the original host.
Outside the user namespace, you will see that the files in /var/lib/lxc/CTNAME/rootfs are now owned not by the expected (same) uids as in the origin installation, but rather by 10000000+remote_uid. This is what you want.
That's it. Once your data is synced, remove everything from the container's /etc/fstab so it won't try to mount things, and it should start. There might be other things to change; check what the LXC template for the containerised distro does. You can definitely remove the kernel, grub, ntp and any hardware-probing packages in the container (you don't even have to run it; you can chroot into the container from the user namespace).
If you don't have a running remote VM, you can also mount the original VM storage in the host namespace and rsync/SSH back in to localhost. The effect will be the same.
If you (as it seems) simply want to change your privileged container to unprivileged, you might as well just add the uid/gid mapping as above to your container config and then do something along the lines of:
for i in $(seq 0 65535); do
    find /var/lib/lxc/CTNAME/rootfs -uid $i -exec chown $((10000000+i)) {} \;
    find /var/lib/lxc/CTNAME/rootfs -gid $i -exec chgrp $((10000000+i)) {} \;
done
That should be all that needs doing; now you should be able to run the container unprivileged. The example above is extremely inefficient; uidshift will probably do a better job at this (but I haven't used it yet).
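For comparison, a single-pass variant is sketched below. It is deliberately a dry run: it only prints the chown commands rather than executing them (drop the echo to apply them for real, as root). shift_ids and OFFSET are names invented for this sketch, and stat -c assumes GNU coreutils.

```shell
OFFSET=10000000

# Walk the tree once, printing one chown per file that shifts both
# the uid and the gid by OFFSET (dry run: remove `echo` to apply).
shift_ids() {
  local root="$1" f u g
  find "$root" -print0 |
    while IFS= read -r -d '' f; do
      u=$(stat -c %u "$f")
      g=$(stat -c %g "$f")
      echo chown -h "$((OFFSET + u)):$((OFFSET + g))" "$f"
    done
}
```

Running shift_ids /var/lib/lxc/CTNAME/rootfs prints the commands so you can inspect them before committing; unlike the 65536-iteration loop, it stats each file exactly once.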
HTH.
A Virtual Machine (VM) is quite a generic term for many virtualisation technologies.
There are many variations on virtualisation technologies, but the main ones are:
- Hardware Level Virtualisation
- Operating System Level Virtualisation
qemu-kvm and VMWare are examples of the first. They employ a hypervisor to manage the virtual environments in which a full operating system runs. For example, on a qemu-kvm system you can have one VM running FreeBSD, another running Windows, and another running Linux.
The virtual machines created by these technologies behave like isolated individual computers to the guest. These have a virtual CPU, RAM, NIC, graphics etc which the guest believes are the genuine article. Because of this, many different operating systems can be installed on the VMs and they work "out of the box" with no modification needed.
While this is very convenient, in that many OSes will install without much effort, it has a drawback: the hypervisor has to simulate all the hardware, which can slow things down. An alternative is para-virtualised hardware, in which a new virtual device and driver is developed for the guest, designed for performance in a virtual environment. qemu-kvm provides the virtio range of devices and drivers for this. The downside is that the guest OS must support them; but where it does, the performance benefits are great.
lxc is an example of Operating System Level Virtualisation, or containers. Under this system there is only one kernel installed: the host kernel. Each container is simply an isolation of the userland processes. For example, a web server (for instance apache) is installed in a container. As far as that web server is concerned, the only installed server is itself. Another container may be running an FTP server. That FTP server isn't aware of the web server installation, only its own. Another container can contain the full userland installation of a Linux distro (as long as that distro is capable of running with the host system's kernel).
However, there are no separate operating system installations when using containers - only isolated instances of userland services. Because of this, you cannot install different platforms in a container - no Windows on Linux.
Containers are usually created using a chroot. This creates a separate private root (/) for a process to work with. By creating many individual private roots, processes (web servers, a Linux distro, etc.) run in their own isolated filesystem. More advanced techniques, such as cgroups, can isolate other resources such as network and RAM.
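Both building blocks mentioned above can be observed for any running process via /proc; a quick illustration:

```shell
# Every process has a private root directory (the thing chroot changes)
# and a cgroup membership (the resource-isolation mechanism):
ls -l /proc/self/root    # symlink to this process's root directory
cat /proc/self/cgroup    # cgroup hierarchies the process belongs to
```

For an ordinary host process the root link points at / and the cgroup file shows the system hierarchy; for a containerised process both reflect its isolated view instead.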
There are pros and cons to both and many long running debates as to which is best.
- Containers are lighter, in that a full OS isn't installed for each; which is the case for hypervisors. They can therefore run on lower spec'd hardware. However, they can only run Linux guests (on Linux hosts). Also, because they share the kernel, there is the possibility that a compromised container may affect another.
- Hypervisors are more secure and can run different OSes because a full OS is installed in each VM and guests are not aware of other VMs. However, this utilises more resources on the host, which has to be relatively powerful.
Best Answer
Running unprivileged containers is the safest way to run containers in a production environment. Containers get bad publicity when it comes to security, and one of the reasons is that some users have found that if a user gets root in a container, there is a possibility of gaining root on the host as well. Basically, what an unprivileged container does is mask the userid from the host. With unprivileged containers, non-root users can create containers; inside the container they appear as root, but on the host they appear as, for example, userid 100000 (whatever you map the userids to). I recently wrote a blog post on this based on Stephane Graber's blog series on LXC (one of the brilliant minds/lead developers of LXC and someone to definitely follow). I say again, extremely brilliant.
From my blog (the ps listings are omitted here): running ps inside the container shows its processes as root, while the same processes viewed from the host appear under uid 100000.
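You can tell which side you are looking from by reading the map itself; in the initial (host) user namespace the mapping is the identity:

```shell
# On the host this typically prints the identity map "0 0 4294967295";
# inside an unprivileged container it shows the shifted range instead
# (e.g. "0 100000 65536").
cat /proc/self/uid_map
```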
So to sum up. Benefits: added security and added isolation. Downsides: a little confusing to wrap your head around at first, and not for the novice user.