Btrfs offers these commands to verify data integrity/checksums:
btrfs scrub start <path>|<device>
btrfs check --check-data-csum <device>
However, AFAIK those always verify the whole filesystem; the path argument only identifies a filesystem on a device, not a file or directory within that filesystem.
Now, I have a 3TB Btrfs filesystem. Scrubbing it takes hours. Sometimes I need to make sure that just one particular file or directory has not been affected by bitrot, for example before using an *.iso installation image or restoring a backup. How do I use Btrfs for this, without falling back to keeping a manual hash file for each file?
I am aware that Btrfs does not store checksums for individual files; it stores checksums for blocks of data. So what I am looking for is a command or tool that identifies all the blocks used to store certain files or directories and verifies only those blocks.
I read somewhere that Btrfs allegedly verifies checksums on read. That is, if a file has been bit-rotted, reading it would fail or something like that. Is this the case?
Best Answer
The answer is: simply try reading the whole file. If the data read from disk does not match what was checksummed, the read fails with an Input/output error. So yes, Btrfs does verify checksums on read!
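As a minimal sketch of "just read it", the helper below reads every regular file under the given paths and reports the ones that cannot be read in full. The function name verify_by_read is hypothetical, and it assumes a POSIX shell with find and cat; it does not talk to Btrfs directly, it only relies on a checksum mismatch surfacing as a read error.

```shell
# verify_by_read PATH...
# Reads every regular file under each PATH completely and discards the data.
# Any file that cannot be read in full is listed on stderr.
# Returns 0 only if find succeeded and every file was readable.
verify_by_read() {
    if [ "$#" -eq 0 ]; then
        echo "usage: verify_by_read PATH..." >&2
        return 2
    fi

    # The inner shell prints one line per file whose read failed.
    bad=$(find "$@" -type f -exec sh -c '
        for f do
            cat -- "$f" > /dev/null 2>&1 || printf "READ ERROR: %s\n" "$f"
        done' sh {} +)
    find_status=$?

    if [ -n "$bad" ]; then
        printf '%s\n' "$bad" >&2
        return 1
    fi
    # A non-zero find status means a path was missing or not traversable.
    [ "$find_status" -eq 0 ]
}
```

For example, `verify_by_read /mnt/data/install.iso` or `verify_by_read /mnt/backups/2020`. Two caveats, as far as I know: data that is already in the page cache is not checked against the disk again, and files without data checksums (nodatacow or nodatasum) cannot be verified this way at all.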
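To find out this answer, I put together the following test: write a file with known content (token1) to a Btrfs filesystem; sed-replace token1 with token2 in the underlying storage, behind the filesystem's back; then try to read the file. Here is the script: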
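A minimal sketch of such a test follows. It is an illustration under assumptions, not necessarily identical to the original script: it assumes a Linux host, root access, btrfs-progs, GNU sed and loop-mount support, and the function name btrfs_corruption_test, the 256M image size and the file names are all made up for the example.

```shell
# btrfs_corruption_test: hypothetical helper; must be run as root.
btrfs_corruption_test() {
    if [ "$(id -u)" -ne 0 ]; then
        echo "btrfs_corruption_test: must be run as root" >&2
        return 2
    fi

    workdir=$(mktemp -d) || return 1
    img=$workdir/fs.img
    mnt=$workdir/mnt
    mkdir "$mnt" || return 1

    # 1. Create a filesystem inside an image file and loop-mount it.
    truncate -s 256M "$img" || return 1
    mkfs.btrfs -q "$img" || return 1    # swap in: mkfs.ext4 -q -F "$img"
    mount -o loop "$img" "$mnt" || return 1

    # 2. Write about 1 MiB of "token1" lines, large enough that the data
    #    is stored in its own extent rather than inline in metadata.
    yes token1 | head -c 1048576 > "$mnt/file"
    sha256sum "$mnt/file"
    umount "$mnt" || return 1           # flush everything to the image

    # 3. Corrupt the data behind the filesystem's back. Both tokens have
    #    the same length, so the image layout is unchanged.
    LC_ALL=C sed -i 's/token1/token2/g' "$img"

    # 4. Remount and try to read the file back.
    mount -o loop "$img" "$mnt" || return 1
    if cat "$mnt/file" > /dev/null; then
        echo "read succeeded; checksum of what was read:"
        sha256sum "$mnt/file"
    else
        echo "read failed"
    fi
    umount "$mnt"
    rm -rf "$workdir"
}
```

Run it as root with `btrfs_corruption_test`. sed rewrites the whole image, so the working directory needs room for a second copy of it.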
As expected, replacing mkfs.btrfs with a command that makes a non-checksumming filesystem (e.g. mkfs.ext4) allows the corrupted file to be read. Of course, its sha256 is different from that of the non-corrupted file.