How to create 100 virtual servers for testing

virtual machine

I want to create 100 virtual servers. They will be used for testing, so they should be easy to create and destroy.

  • They must be accessible through SSH from another physical machine (I provide the public ssh-key)
  • They must have their own IP-address and be accessible from another physical host as ssh I.P.n.o e.g. ssh 10.0.0.99 (IPv4 or IPv6, private address space OK, port-forwarding is not – so this may involve setting up a bridge)
  • They must have basic UNIX tools installed (preferably a full distro)
  • They must have /proc/cpuinfo, a root user, and a netcard (This is probably only relevant if the machine is not fully virtualized)
  • Added bonus if they can be made to run an X server that can be connected to remotely (using VNC or similar)

What is the fastest way (wall clock time) to do this given:

  • The host system runs Ubuntu 20.04 and has plenty of RAM and CPU
  • The LAN has a DHCP-server (it is also OK to use a predefined IP-range)
  • I do not care which Free virtualization technology is used (Containerization is also OK if the other requirements are met)

and what are the actual commands I should run/files I should create?

I have the feeling that given the right technology this is a 50 line job that can be set up in minutes.

The few lines can probably be split into a few bash functions:

install() {
  # Install needed software once
}
setup() {
  # Configure the virtual servers
}
start() {
  # Start the virtual servers
  # After this it is possible to do:
  #   ssh 10.0.0.99
  # from another physical server
}
stop() {
  # Stop the virtual servers
  # After there is no running processes on the host server
  # and after this it is no longer possible to do:
  #   ssh 10.0.0.99
  # from another physical server
  # The host server returns to the state before running `start`
}
destroy() {
  # Remove the setup
  # After this the host server returns to the state before running `setup`
}

Background

For developing GNU Parallel I need an easy way to test running on 100 machines in parallel.

For other projects it would also be handy to be able to create a bunch of virtual machines, test some race conditions and then destroy the machines again.

In other words: This is not for a production environment and security is not an issue.

Docker

Based on @danielleontiev's notes below:

install() {
    # Install needed software once
    sudo apt -y install docker.io
    sudo groupadd docker
    sudo usermod -aG docker $USER
    # Logout and login if you were not in group 'docker' before
    docker run hello-world
}
setup() {
    # Configure the virtual servers
    mkdir -p my-ubuntu/ ssh/
    cp ~/.ssh/id_rsa.pub ssh/
    cat ssh/*.pub > my-ubuntu/authorized_keys
    cat >my-ubuntu/Dockerfile <<EOF
FROM ubuntu:bionic
RUN apt update && \
    apt install -y openssh-server
RUN mkdir /root/.ssh
COPY authorized_keys /root/.ssh/authorized_keys
# run blocking command which prevents container to exit immediately after start.
CMD service ssh start && tail -f /dev/null
EOF
    docker build my-ubuntu -t my-ubuntu
}
start() {
    testssh() {
        ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null root@"$1" echo "'$1'" '`uptime`'
    }
    export -f testssh
    # Start the virtual servers
    seq 100 |
        parallel 'docker run -d --rm --name my-ubuntu-{} my-ubuntu; docker inspect my-ubuntu-{}' |
        # After this it is possible to do:
        #   ssh 10.0.0.99
        # from another physical server
        perl -nE '/"IPAddress": "(\S+)"/ and not $seen{$1}++ and say $1' |
        parallel testssh
    docker ps
}
stop() {
    # Stop the virtual servers
    # After there is no running processes on the host server
    # and after this it is no longer possible to do:
    #   ssh 10.0.0.99
    # from another physical server
    # The host server returns to the state before running `start`
    seq 100 | parallel docker stop my-ubuntu-{}
    docker ps
}
destroy() {
    # Remove the setup
    # After this the host server returns to the state before running `setup`
    rm -rf my-ubuntu/
    docker rmi my-ubuntu
}

full() {
    install
    setup
    start
    stop
    destroy
}

$ time full
real    2m21.611s
user    0m47.337s
sys     0m31.882s

This takes up 7 GB RAM in total for running 100 virtual servers. So you do not even need to have plenty of RAM to do this.

It scales up to 1024 servers after which the docker bridge complains (probably due to Each Bridge Device can have up to a maximum of 1024 ports).

Only thing missing now is making the docker bridge talk to the ethernet, so the containers are accessible from another physical server.

Vagrant

Based on @Martin's notes below:

install() {
    # Install needed software once
    sudo apt install -y vagrant virtualbox
}
setup() {
    # Configure the virtual servers
    mkdir -p ssh/
    cp ~/.ssh/id_rsa.pub ssh/
    cat ssh/*.pub > authorized_keys
    cat >Vagrantfile <<'EOF'
Vagrant.configure("2") do |config|
  config.vm.box = "debian/buster64"
  (1..100).each do |i|
    config.vm.define "vm%d" % i do |node|
      node.vm.hostname = "vm%d" % i
      node.vm.network "public_network", ip: "192.168.1.%d" % (100+i)
    end
  end

  config.vm.provision "shell" do |s|
    ssh_pub_key = File.readlines("authorized_keys").first.strip
    s.inline = <<-SHELL
      mkdir /root/.ssh
      echo #{ssh_pub_key} >> /home/vagrant/.ssh/authorized_keys
      echo #{ssh_pub_key} >> /root/.ssh/authorized_keys
      apt-get update
      apt-get install -y parallel
    SHELL
  end
end
EOF
}
start() {
    testssh() {
        ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null root@"$1" echo "'$1'" '`uptime`'
    }
    export -f testssh
    # Start the virtual servers
    seq 100 | parallel --lb vagrant up vm{}
    # After this it is possible to do:
    #   ssh 192.168.1.111
    # from another physical server
    parallel testssh ::: 192.168.1.{101..200}
}
stop() {
    # Stop the virtual servers
    # After there is no running processes on the host server
    # and after this it is no longer possible to do:
    #   ssh 10.0.0.99
    # from another physical server
    # The host server returns to the state before running `start`
    seq 100 | parallel vagrant halt vm{}
}
destroy() {
    # Remove the setup
    # After this the host server returns to the state before running `setup`
    seq 100 | parallel vagrant destroy -f vm{}
    rm -r Vagrantfile .vagrant/
}

full() {
    install
    setup
    start
    stop
    destroy
}

start gives a lot of warnings:

NOTE: Gem::Specification.default_specifications_dir is deprecated; use Gem.default_specifications_dir instead. It will be removed on or after 2020-02-01.

stop gives this warning:

NOTE: Gem::Specification.default_specifications_dir is deprecated; use Gem.default_specifications_dir instead. It will be removed on or after 2020-02-01.
Gem::Specification.default_specifications_dir called from /usr/share/rubygems-integration/all/gems/vagrant-2.2.6/lib/vagrant/bundler.rb:428.
NOTE: Gem::Specification.default_specifications_dir is deprecated; use Gem.default_specifications_dir instead. It will be removed on or after 2020-02-01.
Gem::Specification.default_specifications_dir called from /usr/share/rubygems-integration/all/gems/vagrant-2.2.6/lib/vagrant/bundler.rb:428.
/usr/share/rubygems-integration/all/gems/vagrant-2.2.6/plugins/kernel_v2/config/vm.rb:354: warning: Using the last argument as keyword parameters is deprecated; maybe ** should be added to the call
/usr/share/rubygems-integration/all/gems/vagrant-2.2.6/plugins/kernel_v2/config/vm_provisioner.rb:92: warning: The called method `add_config' is defined here
/usr/share/rubygems-integration/all/gems/vagrant-2.2.6/lib/vagrant/errors.rb:103: warning: Using the last argument as keyword parameters is deprecated; maybe ** should be added to the call
/usr/share/rubygems-integration/all/gems/i18n-1.8.2/lib/i18n.rb:195: warning: The called method `t' is defined here

Each virtual machine takes up 0.5 GB of RAM on the host system.

It is much slower to start than the Docker machines above. The big difference is that the Vagrant-machines do not have to run the same kernel as the host, but are complete virtual machines.

Best Answer

I think docker meets your requirements.

1) Install docker (https://docs.docker.com/engine/install/) Make sure you are done with linux post installation steps (https://docs.docker.com/engine/install/linux-postinstall/)

2) I assume you have the following directory structure:

.
└── my-ubuntu
    ├── Dockerfile
    └── id_rsa.pub

1 directory, 2 files

id_rsa.pub is your public key and Dockerfile we will discuss below

3) First, we are going to build docker image. It's like template for containers that we are going to run. Each container would be something like materialization of our image.

4) To build image we need a template. It's Dockerfile:

FROM ubuntu:bionic
RUN apt update && \
    apt install -y openssh-server
RUN mkdir /root/.ssh
COPY id_rsa.pub /root/.ssh/authorized_keys

CMD service ssh start && tail -f /dev/null

  • FROM ubuntu:bionic defines our base image. You can find base for Arch, Debian, Apline, Ubuntu, etc on hub.docker.com
  • apt install part installs ssh server
  • COPY from to copies our public key to the place where it will be in the container
  • Here you could add more RUN statements to do additional things: install software, create files, etc...
  • The last is tricky. The first part starts ssh server when we start container which is obvious but the second is important - it runs blocking command which prevents container to exit immediately after start.

5) docker build my-ubuntu -t my-ubuntu - builds image. The output of this command:

Sending build context to Docker daemon  3.584kB
Step 1/5 : FROM ubuntu:bionic
 ---> c3c304cb4f22
Step 2/5 : RUN apt update &&     apt install -y openssh-server
 ---> Using cache
 ---> 40c56d549c0e
Step 3/5 : RUN mkdir /root/.ssh
 ---> Using cache
 ---> c50d8b614b21
Step 4/5 : COPY id_rsa.pub /root/.ssh/authorized_keys
 ---> Using cache
 ---> 34d1cf4e9f69
Step 5/5 : CMD service ssh start && tail -f /dev/null
 ---> Using cache
 ---> a442db47bf6b
Successfully built a442db47bf6b
Successfully tagged my-ubuntu:latest

6) Let's run my-ubuntu. (Once again my-ubuntu is the name of image). Starting container with name my-ubuntu-1 which is derived from my-ubuntu image:

docker run -d --rm --name my-ubuntu-1 my-ubuntu

Options:

  • -d demonize for running container in bg
  • --rm to erase container after container stops. It can be important because when you deal with a lot of containers they can quickly pollute you HDD.
  • --name name for container
  • my-ubuntu image we start from

7) Image is running. docker ps can prove this:

CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                NAMES
ee6bc20fd820        my-ubuntu           "/bin/sh -c 'service…"   5 minutes ago       Up 5 minutes         my-ubuntu-1

8) To execute command in the container run:

docker exec -it my-ubuntu-1 bash - to get into the container's bash. It is possible to provide any command

9) If running command the way above is not enough do docker inspect my-ubuntu-1 and grep IPAddress field. For my it's 172.17.0.2.

ssh root@172.17.0.2
Welcome to Ubuntu 18.04.4 LTS (GNU/Linux 5.6.15-arch1-1 x86_64)

10) To stop container: docker stop my-ubuntu-1

11) Now it is possible to run 100 containers:

#!/bin/bash

for i in $(seq 1 100); do
    docker run -d --rm --name my-ubuntu-$i my-ubuntu
done

My docker ps:

... and so on ...
ee2ccce7f642        my-ubuntu           "/bin/sh -c 'service…"   46 seconds ago      Up 45 seconds                            my-ubuntu-20
9fb0bfb0d6ec        my-ubuntu           "/bin/sh -c 'service…"   47 seconds ago      Up 45 seconds                            my-ubuntu-19
ee636409a8f8        my-ubuntu           "/bin/sh -c 'service…"   47 seconds ago      Up 46 seconds                            my-ubuntu-18
9c146ca30c9b        my-ubuntu           "/bin/sh -c 'service…"   48 seconds ago      Up 46 seconds                            my-ubuntu-17
2dbda323d57c        my-ubuntu           "/bin/sh -c 'service…"   48 seconds ago      Up 47 seconds                            my-ubuntu-16
3c349f1ff11a        my-ubuntu           "/bin/sh -c 'service…"   49 seconds ago      Up 47 seconds                            my-ubuntu-15
19741651df12        my-ubuntu           "/bin/sh -c 'service…"   49 seconds ago      Up 48 seconds                            my-ubuntu-14
7a39aaf669ba        my-ubuntu           "/bin/sh -c 'service…"   50 seconds ago      Up 48 seconds                            my-ubuntu-13
8c8261b92137        my-ubuntu           "/bin/sh -c 'service…"   50 seconds ago      Up 49 seconds                            my-ubuntu-12
f8eec379ee9c        my-ubuntu           "/bin/sh -c 'service…"   51 seconds ago      Up 49 seconds                            my-ubuntu-11
128894393dcd        my-ubuntu           "/bin/sh -c 'service…"   51 seconds ago      Up 50 seconds                            my-ubuntu-10
81944fdde768        my-ubuntu           "/bin/sh -c 'service…"   52 seconds ago      Up 50 seconds                            my-ubuntu-9
cfa7c259426a        my-ubuntu           "/bin/sh -c 'service…"   52 seconds ago      Up 51 seconds                            my-ubuntu-8
bff538085a3a        my-ubuntu           "/bin/sh -c 'service…"   52 seconds ago      Up 51 seconds                            my-ubuntu-7
1a50a64eb82c        my-ubuntu           "/bin/sh -c 'service…"   53 seconds ago      Up 51 seconds                            my-ubuntu-6
88c2e538e578        my-ubuntu           "/bin/sh -c 'service…"   53 seconds ago      Up 52 seconds                            my-ubuntu-5
1d10f232e7b6        my-ubuntu           "/bin/sh -c 'service…"   54 seconds ago      Up 52 seconds                            my-ubuntu-4
e827296b00ac        my-ubuntu           "/bin/sh -c 'service…"   54 seconds ago      Up 53 seconds                            my-ubuntu-3
91fce445b706        my-ubuntu           "/bin/sh -c 'service…"   55 seconds ago      Up 53 seconds                            my-ubuntu-2
54c70789d1ff        my-ubuntu           "/bin/sh -c 'service…"   2 minutes ago       Up 2 minutes         my-ubuntu-1

I can do f.e. docker inspect my-ubuntu-15, get its IP and connect to ssh to it or use docker exec.

It is possible to ping containters from containers (install iputils-ping to reproduce):

root@5cacaf03bf89:~# ping 172.17.0.2 
PING 172.17.0.2 (172.17.0.2) 56(84) bytes of data.
64 bytes from 172.17.0.2: icmp_seq=1 ttl=64 time=1.19 ms
64 bytes from 172.17.0.2: icmp_seq=2 ttl=64 time=0.158 ms
64 bytes from 172.17.0.2: icmp_seq=3 ttl=64 time=0.160 ms
^C
--- 172.17.0.2 ping statistics ---

N.B. running containers from bash is quick solution. If you would like scalable approach consider using kubernetes or swarm

P.S. Useful commands:

  • docker ps
  • docker stats
  • docker container ls
  • docker image ls

  • docker stop $(docker ps -aq) - stops all running containers

Also, follow the basics from docs.docker.com - it's 1 hour time spent for better experience working with containers

Additional:

Base image in the example is really minimal image. It does not have DE or even xorg. You could install it manually (adding packages to RUN apt install ... section) or use image that already has the software you need. Quick googling gives me this (https://github.com/fcwu/docker-ubuntu-vnc-desktop). I have never tried but I think it should work. If you are definitely need VNC access I should try to play around a bit and add info to the answer

Exposing to local network:

This one may be tricky. I am sure it can be done with some obscure port forwarding but the straightforward solution is to change running script as follows:

#!/bin/bash

for i in $(seq 1 100); do
    docker run -d --rm -p $((10000 + i)):22 --name my-ubuntu-$i my-ubuntu
done

After that you would be able to access your containers with host machine IP:

ssh root@localhost -p 10001
The authenticity of host '[localhost]:10001 ([::1]:10001)' can't be established.
ECDSA key fingerprint is SHA256:erW9kguSvn1k84VzKHrHefdnK04YFg8eE6QEH33HmPY.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '[localhost]:10001' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 18.04.4 LTS (GNU/Linux 5.6.15-arch1-1 x86_64)
Related Question