To execute shell commands inside your Docker container, run:
$ docker exec -it erddap bash
This will take you into the Docker container at /opt/tomcat. Your command prompt will look like this:
root@d2adcc7db35a:/opt/tomcat#
First verify that your erddapData folder is at /erddapData, then run:
cd /opt/tomcat/webapps/erddap/WEB-INF/
bash GenerateDatasetsXml.sh
This will start the GenerateDatasetsXml script. Once you have answered all the questions, you can get out of the container using:
exit
The output from this script is in your local erddapData directory:
$ cd /usr/local/erddapData/logs/
$ cat GenerateDatasetsXml.out
You can copy the relevant output into your datasets.xml document, which you should save in /usr/local/erddap/.
Regarding your main points:
Both Docker and KVM have ways to save their current state, no added benefit here
Except that how they store their state is different, and one method or the other may be more efficient. Also, you can't reliably save 100% of the state of a container.
Both Docker and KVM can be provided separate IPs for network use
Depending on what VM and container system you use, this may be easier to set up for VMs than for containers. This is especially true if you want a dedicated layer 2 interface for the VM/container, which is almost always easier to do with a VM.
Both Docker and KVM keep running programs and installed software from conflicting with processes on the host
VMs do it better than containers. Containers still make native system calls to the host OS, which means they can potentially exploit any bugs in those system calls directly. VMs run their own OS, so they're much better isolated.
Both Docker and KVM provide easy ways to scale with enterprise growth
This is about even, though I've personally found that VMs done right scale a bit better than containers done right (most likely because VMs done right offload the permission checks to the hardware, while containers need software to handle them).
Both provide simple methods of moving instances to different hosts
No, not exactly. Both can do offline migration, but many container systems can't do live migration (that is, moving a running container from one host to another without stopping it). Live migration is very important for manageability if you're running at any reasonable scale (need to run updates on the host? Migrate everything to another system, reboot the host, migrate everything off the second host back to the first, reboot that, rebalance.).
Some extra points:
- VMs generally have high-availability options that are easier to work with. This isn't to say that containers don't have such options, just that with VMs they're typically easier to work with and to adapt application code to.
- VMs are a bit easier to migrate directly to and from cloud hosting (you don't have to care to quite the same degree what the underlying hosting environment is like).
- VMs let you run a different platform from the host OS. Even different Linux distributions have sufficient differences in their kernel configuration that stuff written for one is not completely guaranteed to work on another.
- VMs give you better control of the potential attack surface. With containers, you just can't get rid of the fact that the code for your host OS is still in memory, and therefore a potential attack vector. With VMs, you're running an isolated OS, so you can strip it down to the absolute minimum of what you actually need.
- Running a group of related containers together in a VM gives you an easy, foolproof way to start and stop that group of containers together.
Best Answer
Docker gets thrown into the virtualization bucket because people assume that it's somehow virtualizing the hardware underneath. This is a misconception that stems from the terminology Docker uses, mainly the term "container".
However, Docker is not doing anything magical with respect to virtualizing a system's hardware. Rather, it makes use of the Linux kernel's ability to construct "fences" around key facilities, which allows a process to interact with resources such as the network, the filesystem, and permissions (among other things) in a way that gives the illusion of interacting with a fully functional system.
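Those kernel "fences" are namespaces (plus cgroups for resource limits), and you can raise one yourself without Docker at all. A minimal sketch using unshare from util-linux (requires root; the flags create a new PID namespace with its own /proc):

```shell
# Create a new PID namespace and mount a fresh /proc inside it,
# so ps can only see processes started within the namespace.
sudo unshare --pid --fork --mount-proc bash -c '
    echo "my PID in here: $$"   # prints 1 -- bash is the init of this namespace
    ps -eaf                     # shows only this bash and ps itself
'
```

That PID-1 illusion is exactly what a process sees inside a Docker container.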
Here's an example that illustrates what's going on when we start up a Docker container and then enter it by invoking /bin/bash. From inside this container, running ps -eaf shows only the container's own handful of processes. Switching to another terminal tab, where we're logged into the host system that's running the Docker container, we can see the process space that the container is "actually" taking up.
Now if we go back to the Docker tab, launch several processes within it, and background them all, we can see that we now have several child processes running under the primary Bash process which we originally started as part of the Docker container launch.
NOTE: The processes are 4 sleep 1000 commands which are being backgrounded.
Notice how inside the Docker container the processes are assigned low, container-local process IDs (PIDs) of 48-51, and they show up under those numbers in the container's ps -eaf output as well. On the host, however, much of the "magic" that Docker is performing is revealed: the host's ps -eaf shows those same 4 sleep 1000 processes, but as ordinary child processes of our original Bash process, and that original /bin/bash is itself a child process of the Docker daemon too.
Now if we were to wait 1000+ seconds for the original sleep 1000 commands to finish, run 4 more new ones, and start a second Docker container, the host's ps -eaf output would show both containers, and any other Docker containers, simply as processes under the Docker daemon.
So you see, Docker is really not virtualizing (in the traditional sense); it's constructing "fences" around the various kernel resources and limiting their visibility to a given process and its children.