Why can’t a regular user delete a btrfs subvolume?

btrfs

Using a loop-mounted, user-created btrfs filesystem with the permissions set correctly, a user is able to freely create btrfs subvolumes:

user@machine:~/btrfs/fs/snapshots$ /sbin/btrfs sub create newsubvol
Create subvolume './newsubvol'

However, trying to delete the newly created subvolume results in an error:

user@machine:~/btrfs/fs/snapshots$ /sbin/btrfs sub del newsubvol
Delete subvolume '/home/user/btrfs/fs/snapshots/newsubvol'
ERROR: cannot delete '/home/user/btrfs/fs/snapshots/newsubvol'

The root user, of course, is able to delete it:

root@machine:/home/user/btrfs/fs/snapshots# /sbin/btrfs sub del newsubvol
Delete subvolume '/home/user/btrfs/fs/snapshots/newsubvol'

This difference in behavior between the create and delete operations seems a bit strange. Can anyone shed some light on this?

Here is the exact sequence of commands:

user@machine:~$ dd if=/dev/zero of=btrfs_disk bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 1.2345 s, 84.9 MB/s
user@machine:~$ mkdir mountpoint
user@machine:~$ /sbin/mkfs.btrfs btrfs_disk

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

SMALL VOLUME: forcing mixed metadata/data groups
Created a data/metadata chunk of size 8388608
fs created label (null) on btrfs_disk
    nodesize 4096 leafsize 4096 sectorsize 4096 size 100.00MB
Btrfs Btrfs v0.19
user@machine:~$ sudo mount btrfs_disk mountpoint/
user@machine:~$ cd mountpoint/
user@machine:~/mountpoint$ /sbin/btrfs sub create test
Create subvolume './test'
user@machine:~/mountpoint$ /sbin/btrfs sub delete test
Delete subvolume '/home/user/mountpoint/test'
ERROR: cannot delete '/home/user/mountpoint/test' - Operation not permitted

Here are the permissions:

user@machine:~/mountpoint$ ls -la
total 4
drwxr-xr-x 1 user user    8 Set  4 09:30 .
drwx------ 1 user user 4486 Set  4 09:29 ..
drwx------ 1 user user    0 Set  4 09:38 test

And the relevant line on df -T:

Filesystem              Type     1K-blocks      Used Available Use% Mounted on
/dev/loop0              btrfs       102400        32     98284   1% /home/user/mountpoint

The distro is a Debian Wheezy, 3.2.0-4-686-pae kernel, v0.19 btrfs-tools.
The situation still occurs on Ubuntu Saucy, 3.11.0-4-generic kernel, v0.20-rc1 btrfs-tools.

Best Answer

Well, this was a learning experience for me, but I eventually figured it out. I'll explain my process here so that it's easier to figure this stuff out on your own (BTRFS documentation, as I'm sure you found out, is relatively incomplete for the time being).

At first I thought that creating the subvolume was an ioctl with a handler that didn't do any capability check (which may or may not have been a security issue depending on whether there was some logic to it) whereas deleting it was modifying the metadata directly (and thus the user might require CAP_SYS_RAWIO to work properly).

To verify, I cracked open the btrfs-progs source code (the userspace tools, packaged as btrfs-tools on Debian) and this is what I found:

Create subvolume, cmds-receive.c Line 180:
         ret = ioctl(r->dest_dir_fd, BTRFS_IOC_SUBVOL_CREATE, &args_v1);

Delete subvolume, cmds-subvolume.c Line 259:
         res = ioctl(fd, BTRFS_IOC_SNAP_DESTROY, &args);

Well, that ain't helpful: they're both ioctls (interesting side note: "snapshot" is often used interchangeably with "subvolume" in the source code for some reason). So I went to the kernel source code and found both handlers in fs/btrfs/ioctl.c.

Eventually, I traced it back to btrfs_ioctl_snap_destroy() and, on line 2116:

     if (!capable(CAP_SYS_ADMIN)){

Specifically, this checks whether the caller lacks the capability; if they do have it, the logic skips straight to performing the operation. The body of the if statement checks whether the caller is a regular user who owns the subvolume's inode and whether the USER_SUBVOL_RM_ALLOWED btrfs mount option is enabled. If both hold, the handler continues executing. If either one fails, the ioctl handler exits with an error.

So it looks like destroying a "snapshot" (aka "subvolume") generally requires a user that has CAP_SYS_ADMIN (or for USER_SUBVOL_RM_ALLOWED to be enabled and the user "owns" the given subvolume). Great, what about creating a snapshot/volume?

The handler for the ioctl appears to be btrfs_ioctl_snap_create(), and this handler appears to contain no call to capable(), directly or indirectly. Since that's the main way access is brokered, I'm taking this to mean that subvolume creation always succeeds. This explains at a functional level why you're seeing what you're seeing.
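The two checks described above can be modelled in a few lines of plain C. This is only a sketch of the logic as I read it, not kernel code: the struct, its fields, and the function names are hypothetical, and the real handler works on inodes and mount options rather than booleans.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the calling process; not a kernel structure. */
struct caller {
    bool has_cap_sys_admin;  /* would capable(CAP_SYS_ADMIN) succeed? */
    bool owns_subvol_inode;  /* does the caller own the subvolume's inode? */
};

/* Model of the delete path: returns 0 if the operation would proceed,
 * -EPERM if the handler would bail out. */
int may_destroy_subvol(struct caller c, bool user_subvol_rm_allowed)
{
    if (!c.has_cap_sys_admin) {
        /* Regular user: both conditions must hold. */
        if (!user_subvol_rm_allowed)
            return -EPERM;
        if (!c.owns_subvol_inode)
            return -EPERM;
    }
    return 0;
}

/* Model of the create path as described above: no capability check. */
int may_create_subvol(struct caller c)
{
    (void)c;  /* the caller's capabilities are never consulted */
    return 0;
}

/* Walks through the cases from the question. */
void check_model(void)
{
    struct caller user = { .has_cap_sys_admin = false, .owns_subvol_inode = true };
    struct caller root = { .has_cap_sys_admin = true,  .owns_subvol_inode = false };

    assert(may_create_subvol(user) == 0);               /* create works */
    assert(may_destroy_subvol(user, false) == -EPERM);  /* delete refused */
    assert(may_destroy_subvol(root, false) == 0);       /* root may delete */
    assert(may_destroy_subvol(user, true) == 0);        /* option enabled */
}
```

The second case in check_model() is the one from the question: an owning, unprivileged user on a filesystem without USER_SUBVOL_RM_ALLOWED gets "Operation not permitted".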

I can't speak to why this is considered desirable, other than BTRFS's main use case being servers with restricted user access. That's not a sufficient justification, but I'm not seeing any code that actually stops the operation. If you can't find an answer to why that is (and you care to have it), you may have to ask on the kernel mailing list.

Conclusion

My research seems to indicate that anyone can create subvolumes, but in order to delete a subvolume you either need to have CAP_SYS_ADMIN, or it needs to be true both that the calling user is the owner of the subvolume's inode and that USER_SUBVOL_RM_ALLOWED is enabled.

The unrestricted subvolume creation doesn't make sense to me, so I'm probably missing some indirect way that the operation is denied, since it seems like an easy way to DoS a system.

Note: I'm not in a place where I can verify this functionality, but once I get home I can see if setcap magic works the way this predicts.
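When testing this, it helps to confirm which side of the capable(CAP_SYS_ADMIN) check a process would land on. A minimal sketch, assuming Linux procfs, where /proc/self/status carries a "CapEff:" line with the effective capability mask in hex, and CAP_SYS_ADMIN being capability number 21 as defined in linux/capability.h. The helper names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CAP_SYS_ADMIN_NR 21  /* value of CAP_SYS_ADMIN in linux/capability.h */

/* Find the "CapEff:" line in text formatted like /proc/<pid>/status and
 * store its hex mask in *mask. Returns true on success. */
bool parse_cap_eff(const char *status_text, uint64_t *mask)
{
    const char *p = status_text;
    while (p && *p) {
        if (strncmp(p, "CapEff:", 7) == 0) {
            char *end;
            unsigned long long v = strtoull(p + 7, &end, 16);
            if (end == p + 7)
                return false;  /* no hex digits after the label */
            *mask = (uint64_t)v;
            return true;
        }
        p = strchr(p, '\n');   /* advance to the start of the next line */
        if (p)
            p++;
    }
    return false;
}

/* Test one capability bit in a mask. */
bool mask_has_cap(uint64_t mask, int cap)
{
    assert(cap >= 0 && cap < 64);
    return (mask >> cap) & 1u;
}

/* Returns 1 if the current process has CAP_SYS_ADMIN in its effective set,
 * 0 if it does not, and -1 if the status file could not be read or parsed. */
int self_has_cap_sys_admin(void)
{
    char buf[8192];
    FILE *f = fopen("/proc/self/status", "r");
    if (!f)
        return -1;
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';

    uint64_t mask;
    if (!parse_cap_eff(buf, &mask))
        return -1;
    return mask_has_cap(mask, CAP_SYS_ADMIN_NR) ? 1 : 0;
}
```

Calling self_has_cap_sys_admin() from a small main() as the regular user and again as root should show the difference that the delete ioctl keys on, if the analysis above is right.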
