btrfs – Maximal Compression Possible with Btrfs

btrfs, compression

I have been playing around with Btrfs. The best compression ratio I have been able to achieve is 30:1, on a file created like this:

yes foo | head -c 10G > file

Command-line zstd compresses the file at a ratio of 10000:1, so I am somewhat disappointed by 30:1.

Obviously the file will compress a lot more if done by hand, but what is the maximal compression ratio that Btrfs can achieve? Which Btrfs compression algorithm achieves it, and what does a file that compresses this well look like?

(Speed is not an issue).

Best Answer

A message on the linux-btrfs mailing list (https://lore.kernel.org/linux-btrfs/7b4cded9-01fa-4dff-8aaf-fcedc3b27562@gmx.com/) gets us close to an answer:

For compressed data, btrfs has a size limit for data extent, which is 128K. The number is to balance between compression ratio and extra decompression for CoWed extents.

On the other hand, btrfs (any fs) has its minimal block size, and it's 4K for x86_64.

So the upper limit you can get is 128K / 4K = 32.
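The arithmetic in the quoted message can be checked directly in the shell. The sketch below assumes the 128K extent limit and the 4K block size stated in the quote, and applies the same reasoning to the 10G test file from the question. The variable names are illustrative only.

```bash
#!/bin/bash
max_extent=$((128 * 1024))              # compressed extent size limit (from the quote)
block=$((4 * 1024))                     # minimal block size on x86_64 (from the quote)
file_size=$((10 * 1024 * 1024 * 1024))  # the 10G test file from the question

best_ratio=$((max_extent / block))                  # one block per full extent
extents=$((file_size / max_extent))                 # extents needed for the file
min_on_disk_mib=$((extents * block / 1024 / 1024))  # one block per extent, in MiB

echo "best ratio: ${best_ratio}:1"
echo "extents: $extents, minimum on disk: ${min_on_disk_mib} MiB"
```

By this reasoning the 10G file needs at least 320 MiB of data blocks, which is consistent with the roughly 30:1 observed in the question.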

A workaround is to stack multiple compressing Btrfs filesystems on top of each other:

#!/bin/bash

lvl1=$1
lvl2=$2
lvl3=$3

# Die on first error
set -e
rm -f btrfs-lvl1.img

# make level 1
truncate -s 1T btrfs-lvl1.img
mkfs.btrfs btrfs-lvl1.img
mkdir -p btrfs-lvl1
sudo mount -o compress=zstd:$lvl1 btrfs-lvl1.img btrfs-lvl1
sudo chown $(whoami) btrfs-lvl1

# make level 2
truncate -s 1T btrfs-lvl1/btrfs-lvl2.img
mkfs.btrfs btrfs-lvl1/btrfs-lvl2.img
mkdir -p btrfs-lvl2
sudo mount -o compress=zstd:$lvl2 btrfs-lvl1/btrfs-lvl2.img btrfs-lvl2
sudo chown $(whoami) btrfs-lvl2

# make level 3
truncate -s 1T btrfs-lvl2/btrfs-lvl3.img
mkfs.btrfs btrfs-lvl2/btrfs-lvl3.img
mkdir -p btrfs-lvl3
sudo mount -o compress=zstd:$lvl3 btrfs-lvl2/btrfs-lvl3.img btrfs-lvl3
sudo chown $(whoami) btrfs-lvl3

# Now use btrfs-lvl3/ for highly compressible data
head -c 10G /dev/zero > btrfs-lvl3/zero
# Flush pending writes so that du sees what has actually reached each image
sync
du btrfs-lvl2/btrfs-lvl3.img btrfs-lvl1/btrfs-lvl2.img btrfs-lvl1.img

# Unmount (order is important)
sudo umount btrfs-lvl3 btrfs-lvl2 btrfs-lvl1
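The answer does not say how the total ratio was derived from the du output. One plausible reading, sketched here as an assumption, is to divide the size of the test file by the space used by the outermost image. The `ratio` helper and the sample value for `used_kib` are hypothetical and are not measurements from the answer.

```bash
#!/bin/bash
# ratio LOGICAL_KIB USED_KIB: print the integer compression ratio (rounded down)
ratio() {
  echo "$(( $1 / $2 )):1"
}

logical_kib=$((10 * 1024 * 1024))   # the 10G test file, expressed in KiB

# On a live setup (assumption: run before the umount step), one could use:
#   used_kib=$(du -k btrfs-lvl1.img | cut -f1)
used_kib=16384                      # hypothetical sample value, not a measurement

ratio "$logical_kib" "$used_kib"
```

With the hypothetical 16384 KiB this prints 640:1. Dividing by the du value of btrfs-lvl2/btrfs-lvl3.img or btrfs-lvl1/btrfs-lvl2.img instead would give the cumulative ratio up to that image.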

Compression ratios per layer: layer 1: 24-33, layer 2: 9-17, layer 3: 0.7-1.8. The maximal total compression was 854:1.

It makes sense that the ratio gets worse for each layer, and that at layer 3 the size even grows for some values (ratio < 1).
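If each layer's ratio is taken as input size divided by output size, the total ratio is the product of the three per-layer ratios. As a rough check, this sketch multiplies the ends of the per-layer ranges given above. The function names are made up for illustration, and the layer 3 ratio is passed in tenths so that bash integer arithmetic suffices.

```bash
#!/bin/bash
# product_tenths L1 L2 L3_TENTHS: product of three ratios, result in tenths
product_tenths() {
  echo $(( $1 * $2 * $3 ))
}

# fmt TENTHS: format a value given in tenths as a ratio with one decimal
fmt() {
  echo "$(( $1 / 10 )).$(( $1 % 10 )):1"
}

fmt "$(product_tenths 24 9 7)"     # low ends:  24 * 9 * 0.7
fmt "$(product_tenths 33 17 18)"   # high ends: 33 * 17 * 1.8
```

This prints 151.2:1 and 1009.8:1. The totals actually reported (713:1 to 854:1) fall between these bounds, which suggests that the best per-layer ratios did not all occur in the same run.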

Running the script above with compression levels 1-9 for each of lvl1-3 shows the following as good combinations, listed as lvl1 lvl2 lvl3 (total ratio). The values vary somewhat from run to run:

1 1 1 (713:1), 9 8 5 (713:1), 7 3 6 (715:1), 4 4 8 (716:1), 1 1 2 (720:1), 2 1 4 (720:1),
5 6 7 (720:1), 4 4 1 (722:1), 1 8 3 (722:1), 2 7 5 (723:1), 7 8 9 (723:1), 1 5 1 (724:1),
1 6 8 (724:1), 8 9 2 (726:1), 9 2 4 (726:1), 7 8 1 (726:1), 2 9 9 (728:1), 7 7 9 (728:1),
9 9 4 (731:1), 2 6 6 (732:1), 1 6 3 (733:1), 7 5 5 (734:1), 3 4 5 (735:1), 2 5 1 (736:1),
3 6 2 (738:1), 7 8 6 (738:1), 1 6 6 (742:1), 1 8 1 (742:1), 6 4 7 (742:1), 9 8 9 (743:1),
9 5 2 (744:1), 1 3 5 (746:1), 8 3 5 (747:1), 4 1 5 (751:1), 8 6 2 (755:1), 5 9 6 (755:1),
9 8 6 (763:1), 8 1 5 (765:1), 2 9 2 (765:1), 1 8 5 (772:1), 7 6 3 (775:1), 1 9 6 (781:1),
7 7 6 (787:1), 8 9 8 (788:1), 3 9 2 (790:1), 4 2 8 (792:1), 7 4 7 (795:1), 4 9 6 (800:1),
6 5 8 (802:1), 7 7 5 (806:1), 1 5 6 (811:1), 5 6 9 (821:1), 3 5 2 (853:1), 7 5 8 (854:1)

I see no pattern in that, and 1 1 1 is both fast and easy to remember, so I will probably use that.
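The answer does not show how the sweep over compression levels was run. A minimal sketch of one way to enumerate the combinations follows. It assumes the script above is saved as ./stack.sh, which is a made-up name, and it only prints the commands rather than executing them.

```bash
#!/bin/bash
# Hypothetical sweep driver; ./stack.sh is an assumed name for the script above.
script=./stack.sh

# Print every "lvl1 lvl2 lvl3" combination for levels 1-9, one per line.
combos() {
  local l1 l2 l3
  for l1 in {1..9}; do
    for l2 in {1..9}; do
      for l3 in {1..9}; do
        echo "$l1 $l2 $l3"
      done
    done
  done
}

# Dry run: show the commands. Drop the leading "echo" to actually run them.
combos | while read -r l1 l2 l3; do
  echo "$script" "$l1" "$l2" "$l3"
done
```

That is 729 runs, each writing 10G of test data, so a real sweep takes a while. Since the ratios vary from run to run, repeating each combination a few times would be needed before reading anything into small differences.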
