How to make a device available inside a systemd-nspawn container with user namespacing

bind-mount container namespace systemd-nspawn users

I would like to mount an encrypted image file using cryptsetup inside a systemd-nspawn container. However, I get this error message:

[root@container ~]# echo $key | cryptsetup -d - open luks.img luks
Cannot initialize device-mapper. Is dm_mod kernel module loaded?
Cannot use device luks, name is invalid or still in use.

The dm_mod kernel module is loaded on the host system, although things look a bit weird inside the container:

[root@host ~]# grep dm_mod /proc/modules
dm_mod 159744 2 dm_crypt, Live 0xffffffffc12c6000

[root@container ~]# grep dm_mod /proc/modules
dm_mod 159744 2 dm_crypt, Live 0x0000000000000000

strace indicates that cryptsetup is unable to create /dev/mapper/control:

[root@etrial ~]# echo $key | strace cryptsetup -d - open luks.img luks 2>&1 | grep mknod
mknod("/dev/mapper/control", S_IFCHR|0600, makedev(0xa, 0xec)) = -1 EPERM (Operation not permitted)

I am not too sure why this is happening. I am starting the container with the systemd-nspawn@.service template unit, which seems like it should allow access to the device mapper:

# nspawn can set up LUKS encrypted loopback files, in which case it needs
# access to /dev/mapper/control and the block devices /dev/mapper/*.
DeviceAllow=/dev/mapper/control rw
DeviceAllow=block-device-mapper rw

Reading this comment on a related question about USB devices, I wondered whether the solution was to add a bind mount for /dev/mapper. However, cryptsetup gives me the same error message inside the container. When I strace it, it looks like there's still a permissions issue:

# echo $key | strace cryptsetup open luks.img luks --key-file - 2>&1 | grep "/dev/mapper"
stat("/dev/mapper/control", {st_mode=S_IFCHR|0600, st_rdev=makedev(0xa, 0xec), ...}) = 0
openat(AT_FDCWD, "/dev/mapper/control", O_RDWR) = -1 EACCES (Permission denied)

# ls -la /dev/mapper
total 0
drwxr-xr-x 2 nobody nobody      60 Dec 13 14:33 .
drwxr-xr-x 8 root   root       460 Dec 15 14:54 ..
crw------- 1 nobody nobody 10, 236 Dec 13 14:33 control

Apparently, this is happening because the template unit enables user namespacing, which I want anyway for security reasons. As explained in the documentation:

In most cases, using --private-users=pick is the recommended option as it enhances container security massively and operates fully automatically in most cases … [this] is the default if the systemd-nspawn@.service template unit file is used …

Note that when [the --bind option] is used in combination with --private-users, the resulting mount points will be owned by the nobody user. That's because the mount and its files and directories continue to be owned by the relevant host users and groups, which do not exist in the container, and thus show up under the wildcard UID 65534 (nobody). If such bind mounts are created, it is recommended to make them read-only, using --bind-ro=.

Presumably I won't be able to do anything with read-only permissions to /dev/mapper. So, is there any way I can get cryptsetup to work inside the container, so that my application can create and mount arbitrary encrypted volumes at runtime, without disabling user namespacing?

Best Answer

You may be missing the m (mknod) permission. What happens if you specify rwm instead of rw on at least the first DeviceAllow line?
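
As a rough sketch of that suggestion, a drop-in for the template unit could look like the following. The file name device-mapper.conf is hypothetical, and extending the second line to rwm is an assumption that goes beyond "at least the first". Whether this alone is enough with user namespacing enabled is not something the strace output in the question settles.

```ini
# /etc/systemd/system/systemd-nspawn@.service.d/device-mapper.conf
# (hypothetical file name; any *.conf file in this drop-in directory is read)
[Service]
# r = read, w = write, m = mknod (create the device node)
DeviceAllow=/dev/mapper/control rwm
# Assumption: also allow mknod for the device-mapper block devices
DeviceAllow=block-device-mapper rwm
```

After adding the file, run systemctl daemon-reload and restart the container so that the new device policy takes effect.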