I ran into the same problem testing my Ansible playbooks, which require systemd. And as you said, Docker seems like the best approach here, as it is much easier to bring a container up and down than a virtual machine.
First of all, the base/archlinux image is deprecated - you should use archlinux/base instead.
Then, to run systemd totally unprivileged, a number of things should be done:
- provide a container= variable, so systemd won't try to do a number of things it usually does when booting a hardware machine
- systemd actively uses cgroups, so bind mount the /sys/fs/cgroup file system from the host
- bind mounting /sys/fs/fuse is not required, but helps to avoid issues with fuse-dependent software
- systemd thinks that using tmpfs everywhere is a good approach, but running unprivileged makes it impossible for it to mount tmpfs wherever it wants, so pre-mount tmpfs to /tmp, /run and /run/lock
- as the last bit, you need to specify sysinit.target as the default unit to boot instead of multi-user.target or whatever, as you really do not want to start graphical things inside a container
The resulting command line is:
docker run \
--entrypoint=/usr/lib/systemd/systemd \
--env container=docker \
--mount type=bind,source=/sys/fs/cgroup,target=/sys/fs/cgroup \
--mount type=bind,source=/sys/fs/fuse,target=/sys/fs/fuse \
--mount type=tmpfs,destination=/tmp \
--mount type=tmpfs,destination=/run \
--mount type=tmpfs,destination=/run/lock \
archlinux/base --log-level=info --unit=sysinit.target
If we are talking about running a particular service there, like ntpd from your example, you will need to add --cap-add=SYS_TIME, otherwise ntpd will fail with a permission denied error, as nobody wants a container to set the system time by default.
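As a rough illustration of how the pieces above fit together, here is a dry-run sketch that only prints the docker run command from this answer, with extra capabilities appended on request. The helper name build_systemd_run is made up for this example; nothing is executed against Docker.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the docker command described above.
# Usage: build_systemd_run IMAGE [CAPABILITY...]
build_systemd_run() {
    image=$1
    shift
    cmd="docker run --entrypoint=/usr/lib/systemd/systemd --env container=docker"
    cmd="$cmd --mount type=bind,source=/sys/fs/cgroup,target=/sys/fs/cgroup"
    cmd="$cmd --mount type=bind,source=/sys/fs/fuse,target=/sys/fs/fuse"
    # Pre-mount tmpfs where systemd would otherwise try to mount it itself.
    for dir in /tmp /run /run/lock; do
        cmd="$cmd --mount type=tmpfs,destination=$dir"
    done
    # Each remaining argument becomes a --cap-add option, e.g. SYS_TIME for ntpd.
    for cap in "$@"; do
        cmd="$cmd --cap-add=$cap"
    done
    printf '%s\n' "$cmd $image --log-level=info --unit=sysinit.target"
}

build_systemd_run archlinux/base SYS_TIME
```

Printing the command first makes it easy to review which capabilities a given test container is about to receive before actually starting it.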
P.S. I spent quite a while learning how systemd behaves and managed to get it working on a number of operating system images. I described my experience in an article, Running systemd in docker container. It is in Russian, but I believe Google Translate should work in your browser. Thanks
Of course it does.
If they weren't started by the service management subsystem, they won't be tracked by the service management subsystem. And they won't be proper dæmons, really.
background on van Smoorenburg rc compatibility mechanisms
further reading: https://unix.stackexchange.com/a/233581/5132
The van Smoorenburg rc compatibility mechanism provided by systemd is a generator. It ensures that there is a generated abc.service that runs /etc/init.d/abc start at service start and /etc/init.d/abc stop at service stop.
Note, at this point, that the existence of /usr/lib/systemd/system/abc.service completely prevents the generation of an abc.service by systemd's generator.
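One way to check which of the two cases applies is to look at where the unit's fragment file lives. The sketch below assumes that generated units sit under a /run/systemd/generator* directory (generator.late on the systems I have seen); unit_origin is a hypothetical helper name, and the paths passed to it here are examples only.

```shell
#!/bin/sh
# Hypothetical helper: classify a unit by the path of its fragment file.
# Assumption: generator output lives under /run/systemd/generator*/.
unit_origin() {
    case "$1" in
        /run/systemd/generator*/*) echo generated ;;
        "") echo unknown ;;
        *) echo unit-file ;;
    esac
}

# On a live systemd machine the path could come from:
#   systemctl show -p FragmentPath --value abc.service
unit_origin /run/systemd/generator.late/abc.service
unit_origin /usr/lib/systemd/system/abc.service
```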
This is the entire extent of van Smoorenburg rc compatibility in vanilla systemd. The ability for the superuser to directly invoke /etc/init.d/abc in various ways, and have that connect up to systemd, is provided as an augmentation to vanilla systemd by the operating system's individual developers.
The Debian and Ubuntu people, for example, provide a hook in their own /lib/lsb/init-functions.d/ subsystem that acts as follows:
- If the hook detects that the init.d script is being invoked as the ExecStart/ExecStop of a generated systemd service, it does nothing special and just runs the rest of the script as it stands.
- If the hook detects that the init.d script is being invoked directly, not as part of a generated systemd service, it transforms /etc/init.d/name verb into systemctl verb name and switches to that, not running the rest of the script. (Some verbs it passes through, but you are talking of start and stop, which are handled in the way explained here.)
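The translation described in the second point can be sketched roughly as follows. This is not Debian's actual hook code; redirect_to_systemctl is a made-up name, the function only prints what would be run, and treating every verb other than start and stop as a pass-through is a simplification for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of the redirection, NOT the real Debian/Ubuntu hook.
# Prints the command that a direct invocation would be turned into.
redirect_to_systemctl() {
    script=$1
    verb=$2
    name=${script##*/}              # /etc/init.d/abc -> abc
    case "$verb" in
        start|stop) echo "systemctl $verb $name" ;;
        *) echo "$script $verb" ;;  # simplification: other verbs pass through
    esac
}

redirect_to_systemctl /etc/init.d/abc start
redirect_to_systemctl /etc/init.d/abc stop
```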
in the absence of the compatibility mechanisms
Not all operating systems have such a van Smoorenburg rc compatibility mechanism that translates the direct invocation of /etc/init.d/name verb into systemd's way of doing things. Some operating systems (such as the one run by the person at "Why does `init 0` result in "Excess Arguments" on Arch install?", for example) provide no van Smoorenburg rc compatibility at all, not providing a hook like Debian's/Ubuntu's and even outright disabling the compatibility mechanism that comes with vanilla systemd.
On such an operating system, running the van Smoorenburg rc script directly just runs that script as-is.
Such a script does not start a service under service management. It does things such as double-forking and whatnot, in vain and in most cases fruitless attempts to run in the same environment that actual service dæmons run in. (Much of this so-called "dæmonization" stuff does not work and has not worked since the 1980s; this being the reason that proper dæmon management systems were invented in the early 1990s in the first place.) But as far as service management is concerned it is just the superuser in an interactive login session forking stuff.
Indeed, systemd will consider such directly invoked van Smoorenburg rc scripts and all of the vainly "dæmonized" programs that they fork off into the background to be running as part of the user's interactive session scope inside the user's slice, not as services that run outwith user sessions in the system slice.
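The difference shows up in the control group that a process belongs to, which on Linux can be read from /proc/PID/cgroup. The sketch below classifies such a line; cgroup_kind is a hypothetical helper, and the two sample lines are illustrative values in the cgroup v2 format rather than output captured from a real system.

```shell
#!/bin/sh
# Hypothetical helper: tell a login-session scope from a system service
# by looking at a line taken from /proc/<pid>/cgroup.
cgroup_kind() {
    case "$1" in
        */user.slice/*session-*.scope*) echo "user session scope" ;;
        */system.slice/*.service*) echo "system service" ;;
        *) echo "other" ;;
    esac
}

# Sample lines (assumed, for illustration only):
cgroup_kind "0::/user.slice/user-0.slice/session-3.scope"
cgroup_kind "0::/system.slice/abc.service"
```

A dæmon forked from a directly invoked rc script would be expected to fall into the first category, and one started through the service manager into the second.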
Worse, it ends up using the highly flawed mechanisms of the van Smoorenburg rc system, such as killing all processes running anywhere that match the service name at service stop, instead of just the specific service processes that the service manager started and is tracking. This is why /etc/init.d/name stop is appearing to work for you. The script is killing all processes that match a name, which just happens to also include the processes that run under the service manager. This indiscriminate killing of everything is a bug, though, not a feature. This is only the appearance of proper functionality, and it will bite you down the road, as it has bitten so many system administrators in the past several decades.
the correct thing to do
If you lack such compatibility mechanisms, then do not invoke van Smoorenburg rc scripts directly. It is as simple as that. Use the service or systemctl commands to communicate with systemd's service management; but do not run /etc/init.d/anything directly for stopping, starting, and obtaining the statuses of services.
A subordinate point is that you should not be mucking around with /usr/lib/systemd/system/abc.service just to try to bodge the van Smoorenburg rc script into sort-of working. Type=forking is almost certainly wrong for your service. (It does not match almost all actual services in the wild.) And if the people who came up with /usr/lib/systemd/system/abc.service managed to get rid of the well-known-to-be-broken PID file mechanism that is wholly unnecessary for true service management, it is outright daft to be putting it back in again.
I've managed to fix this issue in a CentOS:7 Docker container. I mainly followed the guide on the CentOS Docker image project.
Now, build the image, and run it using at least the following arguments to the docker run command:
-v /run -v /sys/fs/cgroup:/sys/fs/cgroup:ro
The main point is that /usr/sbin/init must be the first process inside the Docker container. So if you want to use a custom script that executes some commands before running /usr/sbin/init, launch it at the end of your script using exec /usr/sbin/init (in a bash script).
Here is an example:
And here is the content of cmd.sh:
You could get the error System is booting up. See pam_nologin(8) if you're using the PAM system. In that case, delete /usr/lib/tmpfiles.d/systemd-nologin.conf in your Dockerfile, because it creates the file /var/run/nologin, which generates this specific error.
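On the earlier point about ending a custom script with exec: the reason it matters is that exec replaces the shell with the new program and keeps the shell's process ID, whereas a plain invocation runs the program as a child with a new ID. The following sketch demonstrates this with ordinary sh processes standing in for the startup script and for /usr/sbin/init; it does not need Docker.

```shell
#!/bin/sh
# Each sub-shell prints its own PID, then starts an inner shell that prints its PID.

# With exec: the inner shell replaces the outer one, so both PIDs are the same.
with_exec=$(sh -c 'echo $$; exec sh -c "echo \$\$"')

# Without exec: the inner shell is a child process and gets a different PID.
# The trailing "true" keeps the outer shell alive until the child has finished.
without_exec=$(sh -c 'echo $$; sh -c "echo \$\$"; true')

echo "with exec:"
echo "$with_exec"
echo "without exec:"
echo "$without_exec"
```

In a container whose entry point is the script, the script itself is the first process, so exec is what lets /usr/sbin/init inherit that position.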