In other words: may root-owned unprivileged containers be "less unprivileged" than ones owned by standard accounts?
I don't think so. What matters is what's in /proc/$PID/uid_map of processes in the container's user namespace, not what's in /etc/subuid. Suppose you execute the following from the initial user namespace (that is, not from the container), where $PID is that of a process running in the container:
$ cat /proc/$PID/uid_map
0 200000 1000
This means that the UID range [0-1000) of the process $PID is mapped to the UID range [200000-201000) outside of its user namespace (the container's). UIDs outside of the [200000-201000) range appear as 65534 (the value in /proc/sys/kernel/overflowuid) inside the container. This can happen, for instance, if you don't create a new PID namespace. In that case, the process in the container would see processes outside, but their UID would be shown as 65534.
So with proper UID mapping, even if the container is started by root, its processes will have unprivileged UIDs outside of it.
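A minimal sketch of the arithmetic behind a single uid_map line, using the "0 200000 1000" example above. The helper names to_outside and to_inside are hypothetical, only one map line is handled (a real uid_map may contain several), and 65534 is hard-coded on the assumption that overflowuid has that value.

```shell
# A uid_map line has three fields: inside_start outside_start length.

# UID as seen outside the namespace for a UID used inside it.
to_outside() {
    uid=$1
    set -- $2                      # split the map line into $1 $2 $3
    if [ "$uid" -ge "$1" ] && [ "$uid" -lt $(($1 + $3)) ]; then
        echo $(($2 + uid - $1))
    else
        echo "unmapped"
    fi
}

# UID as seen inside the namespace for a UID that exists outside it.
to_inside() {
    uid=$1
    set -- $2
    if [ "$uid" -ge "$2" ] && [ "$uid" -lt $(($2 + $3)) ]; then
        echo $(($1 + uid - $2))
    else
        echo 65534                 # assumed overflowuid value
    fi
}

to_outside 0 "0 200000 1000"       # container root is 200000 outside
to_inside 200999 "0 200000 1000"   # last mapped UID is 999 inside
to_inside 1000 "0 200000 1000"     # an unmapped outer UID shows as 65534
```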
Subordinate UIDs in /etc/subuid are not in any way linked to a single UID outside. The purpose of this file is to allow unprivileged users to start containers that use more than one UID (which is the case for most Linux operating systems). By default, if you're an unprivileged user, you can only map your own UID. That is, if your UID is 1000 and $PID refers to a process in the container, you can only do
echo "$N 1000 1" >/proc/$PID/uid_map
for any $N as an unprivileged user. Everything else is not permitted. If you could map a longer range, e.g.
echo "$N 1000 50" >/proc/$PID/uid_map
you would gain access to UIDs [1000-1050) outside of the container through the container. And of course, if you could change the start of the outer UID range, you'd have an easy way to get root. So /etc/subuid defines the outer ranges which you are allowed to use. This file is used by newuidmap, which is setuid root.
$ cat /etc/subuid
woky:200000:50
$ echo '0 200000 50' >/proc/$PID/uid_map
-bash: echo: write error: Operation not permitted
$ newuidmap $PID 0 200000 50
$ # success
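The range check implied by the transcript above could be sketched roughly as follows. This is only an illustration: subuid_allows is a hypothetical helper, it reads a temporary file in the "name:start:count" format rather than the real /etc/subuid, and the checks newuidmap actually performs are more involved.

```shell
# Succeed if the outer range [start, start+count) lies entirely inside
# a range granted to the user in a subuid-style file.
subuid_allows() {
    file=$1 user=$2 start=$3 count=$4
    awk -F: -v u="$user" -v s="$start" -v c="$count" '
        $1 == u && s >= $2 && s + c <= $2 + $3 { ok = 1 }
        END { exit !ok }
    ' "$file"
}

tmp=$(mktemp)
echo 'woky:200000:50' >"$tmp"
subuid_allows "$tmp" woky 200000 50 && echo "200000+50: allowed"
subuid_allows "$tmp" woky 200000 51 || echo "200000+51: denied"
subuid_allows "$tmp" woky 1000 50   || echo "1000+50: denied"
rm -f "$tmp"
```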
The details are much more complicated and I'm probably not the proper person to explain them, but I guess it's better than having no answer. :-) You might want to check the man pages user_namespaces(7) and newuidmap(1), and my own research in "First process in a new Linux user namespace needs to call setuid()?". Unfortunately, I'm not entirely sure how LXC uses this file.
The rules are as follows:
- If the user is the root user (UID=0), grant full access
- If the user is the owner, use the owner triplet as permission
- If the user isn't the owner but belongs to the group, use the group triplet as permission
- If the user is neither the owner nor a member of the group, use the other triplet as permission
So for root it doesn't matter much anyway, but for other users only the first matching triplet applies. So in your example, if the user is both the owner and a member of the group, it's the owner triplet that's used (not the group triplet). So if the group has both read and write permission, but the owner only has read permission, then the owner will only be allowed to read the file, even though his group membership ought to let him write to it too. Other (non-owner) members of the group will be allowed to both read and write the file.
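The selection rules above can be sketched as a small shell function. effective_triplet is a hypothetical helper written for illustration; the kernel performs this check itself, and the UIDs and GIDs below are made up.

```shell
# Arguments: 9-character mode string, file owner UID, file GID,
#            the user's UID, space-separated list of the user's GIDs.
effective_triplet() {
    mode=$1 owner=$2 group=$3 uid=$4 gids=$5
    if [ "$uid" -eq 0 ]; then
        echo "full access"; return
    fi
    if [ "$uid" -eq "$owner" ]; then
        printf '%s\n' "$mode" | cut -c1-3; return    # owner triplet
    fi
    for g in $gids; do
        if [ "$g" -eq "$group" ]; then
            printf '%s\n' "$mode" | cut -c4-6; return  # group triplet
        fi
    done
    printf '%s\n' "$mode" | cut -c7-9                # other triplet
}

# Owner who is also in the group: the owner triplet wins (read only).
effective_triplet r--rw---- 1000 100 1000 "100"
# Another member of the group: the group triplet (read and write).
effective_triplet r--rw---- 1000 100 1001 "100"
# Anyone else: the other triplet (no access).
effective_triplet r--rw---- 1000 100 1002 "200"
```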
The owner of a file can always grant himself more permissions if he needs to, and sometimes a program may do it for you. For example, if you have a write-protected file (e.g. permission r--r-----), some editors will allow you to write to it anyway (usually after a confirmation). The editor is running as you, and since you own the file and can change its permissions, the editor can remove the write protection and allow you to save the file.
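What such an editor does can be imitated by hand. The sketch below works on a temporary file; mode_of and owner_override_demo are hypothetical helper names, and the mode string is taken from the first ten characters of ls -l output.

```shell
# Print a file's type and permission bits, e.g. -r--r-----
mode_of() { ls -ld "$1" | cut -c1-10; }

owner_override_demo() {
    f=$(mktemp)
    chmod 0440 "$f"             # write-protect the file
    mode_of "$f"
    chmod u+w "$f"              # allowed because we own the file
    echo "new contents" >"$f"   # the save now succeeds
    mode_of "$f"
    rm -f "$f"
}

owner_override_demo
```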
+++
It means that the root user owns the file and has permission to both read and write it; the owner (root) may also change the file's permissions. Members of the root group are allowed to read it. Other users can neither read, write nor execute the file. (Since it's a text file, there's probably little point in executing it anyway.)
Many files on a Linux system have the root user as their owner and the root group as their group. Traditionally, though, various system users and system groups (like bin, sys, proc, operator) owned many files rather than root. For example, the binaries (the executable programs) were usually owned by the bin user and/or bin group (e.g. bin:bin or root:bin).
The exception to this was executables that had to run as root: they had to be owned by the root user. Usually a program executes with the permissions of whichever user started it. If you run the command ls, it runs with your permissions, and therefore cannot show directories you're not allowed to list (like the directories of other users). If a command is run with root permissions, on the other hand, it has access to the whole system (which is why you don't want that for most executables).
One good example is the passwd command, which lets you change your password. It runs as root, and gives any user limited access to the files used to store the user and password databases.
rwsr-xr-x root:root /usr/bin/passwd
s = x + S, where x is execute permission and S is "run as owner" or "run as group", depending on whether it's set in the owner or the group triplet.
So the root user is the owner and has read, write and execute permission. The root group is the group and has read and execute permission. Other users also have read and execute permission. In addition, the executable will run with the permissions of its owner (i.e. root) and not with the permissions of the user executing it (as is normal), thanks to the "s" (u+s) in the owner triplet.
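The "s = x + S" notation can be observed on a scratch file. This is a sketch with hypothetical helper names (mode_of, setuid_demo); it only changes the mode bits of a temporary file owned by the current user and never executes it.

```shell
# Print a file's type and permission bits, e.g. -rwsr-xr-x
mode_of() { ls -ld "$1" | cut -c1-10; }

setuid_demo() {
    f=$(mktemp)
    chmod 0755 "$f"; mode_of "$f"   # plain executable: x in the owner triplet
    chmod u+s "$f";  mode_of "$f"   # x plus "run as owner" shows as s
    chmod u-x "$f";  mode_of "$f"   # "run as owner" without x shows as S
    rm -f "$f"
}

setuid_demo
```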
Another example, this time from BSD (a UNIX OS):
rws--x--- root:wheel /bin/su
This means that the executable su always runs as its owner, root. The root user may read, write and execute it. Members of the wheel group are allowed to execute it, but not to read (e.g. copy) it. Other users may neither read, write nor execute it. (The command su exists on Linux too, but there all users may execute it; it still runs as the root user, though.)
Other programs may also run as some system user (and group). For example, the apache web server is often run as the www-data user (and www-data group). This way it can't do too much damage if compromised, because it lacks permissions where it doesn't belong.
Best Answer
If your threat model is binary (either the user account is not compromised, or it is fully compromised with arbitrary command execution), there isn't any difference. As you note, the attacker can just chmod u+w the file.
Partial compromise, however, can happen. E.g., if the attacker only gains "write to any file", then he can't write to the 0400 file. This is of questionable benefit in a lot of cases though: "write to any file" can often be elevated to arbitrary command execution fairly easily (e.g., write to ~/.bashrc).
Further, with write permission on the parent directory, you can still delete (unlink) a mode 0400 file. This means an attacker could delete the existing file and put a new one in place under the same name.
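The unlink-and-replace point can be checked in a scratch directory. In this sketch, replace_readonly_demo is a hypothetical name and the directory comes from mktemp -d, so nothing outside it is touched.

```shell
replace_readonly_demo() {
    d=$(mktemp -d)
    echo "original" >"$d/README"
    chmod 0400 "$d/README"       # the file itself is read-only
    # Unlinking needs write permission on the directory, not on the file.
    rm -f "$d/README"
    echo "replaced" >"$d/README"
    cat "$d/README"
    rm -rf "$d"
}

replace_readonly_demo
```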
The one "security" use that comes to mind is an FTP (etc.) dropbox. You could put a README file in there, mode 0400. If your dropbox parent directory is set to a+rwxt, then everyone can add new files there and (due to +t) only delete files they own. So your README would be protected.
So I think overall it's a prevent-accidents feature more than a security feature.