How much space would be freed by removing a btrfs subvolume

btrfs

Is there any way to calculate how much space I would free if I removed one (or several) subvolumes on a Btrfs disk (without actually removing them)? I know that there is "currently no code that will do the calculation for you", but how would you do it?

I also wonder why they are saying that it would be so slow? Both actually removing a subvolume and asking about free space is very fast in my experience, why would doing the same thing hypothetically be so much slower?

Best Answer

You should take a look at btrfs quota and btrfs qgroups (quota groups).

Basically, qgroups do exactly what you asked for: they track how much space is allocated by each subvolume. To enable qgroup functionality for a btrfs filesystem, run

# btrfs quota enable /path/to/btrfs/filesystem

However, be warned before you do this: it triggers a complete recomputation of the qgroup data, which will take some time, especially on large filesystems with many subvolumes. This process runs asynchronously in the background. You can check the status of the qgroups while it is still running with

# btrfs qgroup show /path/to/btrfs/filesystem

This will give you some output like this:

WARNING: rescan is running, qgroup data may be incorrect
qgroupid         rfer         excl
--------         ----         ----
0/5         843.69GiB     61.91MiB
0/4881      811.06GiB      9.34GiB
0/7990      867.32GiB    329.91MiB
0/8400      867.17GiB     37.64MiB

(The warning in the first line is present as long as the rescan is still running.)
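
If you read these numbers from a script, it may be worth refusing to use them while that warning is present. A minimal sketch, assuming the warning text was captured together with the table (on a real system it may be printed on stderr, so something like 2>&1 might be needed); rescan_running is a hypothetical helper name and the captured text is copied from the output above:

```shell
# Hypothetical helper: reads captured `btrfs qgroup show` text on stdin and
# succeeds if it contains the rescan warning.
rescan_running() {
    grep -q 'rescan is running'
}

# Text copied from the example output above.
captured='WARNING: rescan is running, qgroup data may be incorrect
qgroupid         rfer         excl
--------         ----         ----
0/5         843.69GiB     61.91MiB'

if printf '%s\n' "$captured" | rescan_running; then
    echo "numbers not final yet"
else
    echo "rescan finished"
fi
```

Alternatively, btrfs quota rescan -s /path/to/btrfs/filesystem reports the status of a running rescan directly.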

Btrfs automatically creates a qgroup for each subvolume. In this case there are three subvolumes with subvolume IDs 4881, 7990, and 8400. The part before the forward slash is the level of the qgroup. Each subvolume qgroup is on level 0. Additionally there is a special qgroup on level 0 that always has ID 5 and corresponds to the root of the btrfs filesystem.

For each qgroup the above output shows how much space is referenced by it. That means that the corresponding subvolume contains files whose total size equals the shown number.

However, due to snapshots and the copy-on-write nature of btrfs, subvolumes may share files. This means that the content (or, more precisely, the extents) of a file may be referenced by more than one subvolume. This is expressed by the second number (excl), which shows how much space is allocated exclusively by each subvolume and is not shared with any other subvolume. If you delete a subvolume, this is the space that will actually be freed.
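
To pull that number out for a single subvolume, you can filter the table on the qgroup ID. A small sketch, assuming the three-column layout shown above; qgroup_excl is a hypothetical helper name, and the sample text stands in for the live output of btrfs qgroup show:

```shell
# Sample rows copied from the output above. On a live system this text would
# come from: btrfs qgroup show /path/to/btrfs/filesystem
sample='qgroupid         rfer         excl
--------         ----         ----
0/5         843.69GiB     61.91MiB
0/4881      811.06GiB      9.34GiB
0/7990      867.32GiB    329.91MiB
0/8400      867.17GiB     37.64MiB'

# Hypothetical helper: print the excl column (third field) of one qgroup.
# Usage: <qgroup show output> | qgroup_excl <qgroupid>
qgroup_excl() {
    awk -v id="$1" '$1 == id { print $3 }'
}

printf '%s\n' "$sample" | qgroup_excl 0/4881   # prints 9.34GiB
```

So deleting subvolume 4881 alone would free roughly 9.34GiB in this example.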

If you want to find out how much space would be freed by deleting multiple subvolumes, you can use the aforementioned levels. qgroups are organized in a hierarchy, and groups on upper levels (higher than 0) aggregate the information of the groups assigned to them on lower levels.

Thus, to find out how much space would be freed if subvolumes 4881 and 7990 (in the above example) were deleted, create a new qgroup on level 1 (here arbitrarily with ID 0, but you may choose whatever ID you like) with

# btrfs qgroup create 1/0 /path/to/btrfs/filesystem

Then assign the newly created qgroup as a parent to the qgroups of the subvolumes you want to delete with

# btrfs qgroup assign 0/4881 1/0 /path/to/btrfs/filesystem
# btrfs qgroup assign 0/7990 1/0 /path/to/btrfs/filesystem
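
With more than a couple of subvolumes, typing these lines by hand gets tedious. The following sketch only prints the create and assign commands for review and does not run them; plan_group is a hypothetical helper name, and the path and IDs are the ones from this example:

```shell
# Hypothetical helper: print (do not execute) the commands that put several
# subvolume qgroups under one parent qgroup.
# Usage: plan_group <filesystem path> <parent qgroupid> <subvolume ID>...
plan_group() {
    fs=$1
    parent=$2
    shift 2
    echo "btrfs qgroup create $parent $fs"
    for id in "$@"; do
        # Subvolume qgroups always live on level 0, hence the 0/ prefix.
        echo "btrfs qgroup assign 0/$id $parent $fs"
    done
}

plan_group /path/to/btrfs/filesystem 1/0 4881 7990
```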

Assigning the qgroups triggers another rescan of the quota information, which may take a while. Once it has finished, run

# btrfs qgroup show -p /path/to/btrfs/filesystem

you get an output like this:

qgroupid         rfer         excl parent
--------         ----         ---- ------
0/5           1.38TiB      2.51GiB ---
0/4881        1.11TiB     10.86GiB 1/0
0/7990        1.23TiB    502.41MiB 1/0
0/8400        1.34TiB      1.69GiB ---
1/0           1.51TiB    132.23GiB ---

(I added the -p flag to add the parent column to the output which shows the parent/child relationship of the qgroups.)

Now the line with qgroup 1/0 tells you how much space is referenced by both subvolumes you want to delete and, more importantly, it tells you how much space is allocated by them exclusively. This is the amount of space that will be freed if you delete both subvolumes.
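
Note that this figure is not simply the sum of the two individual excl values. A quick check with the numbers from the output above (a sketch; compare_excl is a hypothetical name, and the values are typed in by hand from the table):

```shell
# Compare the parent qgroup's excl with the sum of its members' excl values.
compare_excl() {
    LC_ALL=C awk 'BEGIN {
        excl_4881  = 10.86           # GiB, excl of 0/4881
        excl_7990  = 502.41 / 1024   # MiB converted to GiB, excl of 0/7990
        excl_group = 132.23          # GiB, excl of 1/0

        sum = excl_4881 + excl_7990
        printf "sum of individual excl: %.2f GiB\n", sum
        printf "shared only inside the group: %.2f GiB\n", excl_group - sum
    }'
}

compare_excl
# sum of individual excl: 11.35 GiB
# shared only inside the group: 120.88 GiB
```

The difference is data that the two subvolumes share with each other but with nothing else. Deleting only one of them frees none of it, while deleting both frees all of it, which is why the parent qgroup is needed.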

You also asked: "I also wonder why they are saying that it would be so slow?"

This is due to the copy-on-write nature of btrfs together with snapshots. If you create a snapshot in btrfs (normally) all actual data in the newly created subvolume that contains the snapshot is shared with the source of the snapshot. Only when a file is changed or replaced in the source does it point to different content (extents). This makes it very difficult to assess how much space would actually be freed if a subvolume is deleted because you have to account for all the space that is shared with other subvolumes.
