From the btrfs gotchas page:
Files with a lot of random writes can become heavily fragmented (10000+ extents), causing thrashing on HDDs and excessive, multi-second spikes of CPU load on systems with an SSD or a large amount of RAM.
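Extent counts like the "10000+ extents" mentioned above can be inspected with filefrag from e2fsprogs, which works on btrfs as well as ext4. A quick sketch; the file path is just an example:

```shell
# Create a small demo file and report how many extents it occupies.
# A badly fragmented VM disk image would report thousands of extents here.
dd if=/dev/zero of=/tmp/frag-demo.img bs=1M count=8 status=none
filefrag /tmp/frag-demo.img
```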
I had similar problems to the ones you describe, with VirtualBox. The nodatacow
option for btrfs did not help in a noticeable way on my system. I also tried the autodefrag mount option (mentioned as a possible solution for application databases in desktop environments), again without results that would make the behaviour acceptable.
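For completeness, nodatacow is usually applied per directory with chattr rather than filesystem-wide. A sketch, with a hypothetical image directory; this only works on btrfs, and the flag only affects files created after it is set:

```shell
# Create an empty directory for VM images and mark it NOCOW (+C).
# Existing files are not converted; copy images in after setting the flag.
mkdir -p ~/vm-images
chattr +C ~/vm-images
lsattr -d ~/vm-images   # the 'C' flag should appear in the listing
```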
In the end I shrank my btrfs partition and the Logical Volume it lives in, created a new LV, formatted it as ext4, and then put my VirtualBox VM disk images on that "partition".
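The shrink-and-carve-out steps above could look roughly like this; the VG/LV names and sizes are hypothetical, shrinking is destructive if the numbers are wrong, and you should have a backup before trying it:

```shell
# 1. Shrink the btrfs filesystem while it is still mounted (btrfs resizes online):
btrfs filesystem resize -50G /data
umount /data
# 2. Shrink the underlying LV, keeping a margin so it stays larger than the fs:
lvreduce -L -45G vg0/data
mount /dev/vg0/data /data
# 3. Carve a new LV out of the freed space and format it as ext4 for the images:
lvcreate -L 40G -n vmimages vg0
mkfs.ext4 /dev/vg0/vmimages
mount /dev/vg0/vmimages /srv/vm-images
```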
OK, here's what I found happening during the periodic balances:
The following process is started on the host:
btrfs balance start -dsweep lt:/dev/md127:7 /data

(the environment of the process, from the listing: LANG=en_US.UTF-8 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin DBUS_SESSION_BUS_ADDRESS=unix:path=/var/netatalk/spotlight.ipc TRACKER_USE_CONFIG_FILES=1 TRACKER_USE_LOG_FILES=1 XDG_DATA_HOME=/apps/.xdg/local/share XDG_CONFIG_HOME=/apps/.xdg/config XDG_CACHE_HOME=/apps/.xdg/cache)
where /data is my tiered data-volume, /dev/md127 is the SSD array used as buffer/cache.
This process runs until the data from the SSD tier has been moved almost completely to the HDD tier. For example, somewhere along the way I see:
btrfs fi sh /data
Label: '0a44c6bc:data' uuid: ed150b8f-c986-46d0-ada8-45ee219acbac
Total devices 2 FS bytes used 393.14GiB
devid 1 size 7.12TiB used 359.00GiB path /dev/md126
devid 2 size 114.68GiB used 42.06GiB path /dev/md127
and the usage of the SSD tier keeps dropping until it is almost zero.
The strange thing is that so far I have not been able to run this command manually.
I still cannot figure out the 'sweep' balance filter.
This is what --help shows:
# btrfs balance start --help
usage: btrfs balance start [options] <path>
Balance chunks across the devices
Balance and/or convert (change allocation profile of) chunks that
passed all filters in a comma-separated list of filters for a
particular chunk type. If filter list is not given balance all
chunks of that type. In case none of the -d, -m or -s options is
given balance all chunks in a filesystem. This is potentially
long operation and the user is warned before this start, with
a delay to stop it.
-d[filters] act on data chunks
-m[filters] act on metadata chunks
-s[filters] act on system chunks (only under -f)
-v be verbose
-f force reducing of metadata integrity
--full-balance do not print warning and do not delay start
--background|--bg
run the balance as a background process
but this does not explain how it maps to the "lt:/dev/md127:7" part of the command that runs periodically:
btrfs balance start -dsweep lt:/dev/md127:7 /data
What's the meaning here: run until the data usage of /dev/md127 falls below 7%?
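For what it's worth, 'sweep' is not a filter documented in stock btrfs-progs, so it may be a vendor-specific patch on this NAS; that is speculation. For comparison, the filters that btrfs-balance(8) does document look like this; they require a mounted btrfs filesystem, so treat these as illustrations only:

```shell
# Balance only data chunks whose usage is at or below 7 percent:
btrfs balance start -dusage=7 /data
# Balance only data chunks that have a stripe on device id 2:
btrfs balance start -ddevid=2 /data
# Process at most 10 chunks, useful for balancing incrementally:
btrfs balance start -dlimit=10 /data
```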
Best Answer
This situation can be caused by file fragmentation. You can try to solve it with btrfs's built-in defragmentation. Note that btrfs filesystem defragment operates on files and directories of a mounted filesystem, not on a raw device, so do not unmount; go to the terminal and type:

sudo btrfs filesystem defragment -r /mnt/point

In this command /mnt/point is the mount point of the filesystem with the problem, or a specific directory or file on it (such as a VM disk image). Make sure you check the correct path before you try anything. Also be aware that defragmenting unshares extents, so if you use snapshots it can increase disk usage.
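Two documented options may help here: -r recurses into directories, and -t sets the extent size the defragmenter aims for. A sketch, with an example mount point:

```shell
# Recursively defragment a directory on a mounted btrfs filesystem,
# asking for target extents of up to 32MiB:
sudo btrfs filesystem defragment -r -t 32M /mnt/data/vm-images
```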