CentOS – What are the data integrity / bit-rot protection options on CentOS 7?

centos, data-integrity, md, zfs

I have a two-disk CentOS 7 machine on which I need data integrity / bit-rot protection. How can I achieve this?

Note: from my reading, Btrfs, ZFS, and dm-integrity do not seem to be options.

  • Btrfs is not an option, as it will be deprecated by RHEL and CentOS.
  • ZFS is not natively supported on RHEL/CentOS, and Red Hat has no intention of supporting it in the future. Also, the data corruption bug within ZFSonLinux in April 2018 does not bode well for that implementation.
  • dm-integrity is not an option, as the CentOS kernel versions are older and, as far as I know, it is not available on CentOS.
  • It seems RAID6 using md (on 4 partitions) is not an option because, AFAIK, it does not calculate the checksum on each read. According to this answer, a scrub may not correctly fix errors anyway.

Note: CentOS was chosen for stability and long-term support.

Best Answer

mdadm RAID does not verify data on every read (that would be slow) and cannot reliably correct a mismatch, but it can be used to detect one (mismatch_cnt != 0 after a check). If you do use mdadm (for other reasons) and do run the obligatory checks (for obvious reasons), make the mail report include mismatch_cnt. (And don't forget about SMART monitoring, either, and don't wait to replace drives with bad or reallocated sectors...)

That way, if bitrot is occurring on any individual disk, you would at least get some notification about it. I've monitored my RAID like that for years and it has never happened (other than when provoked to test the functionality itself).

As a result, I don't think bitrot is a common issue (on a hardware level).
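The mismatch_cnt check described above could be scripted roughly as follows. This is a minimal Python sketch, not a finished monitoring tool. It assumes the kernel exposes each array's counter at /sys/block/mdX/md/mismatch_cnt and that a check has already run (the counter is only meaningful after one). The function names read_mismatch_counts and format_report are made up for illustration, and actually mailing the report is left to whatever cron or mail setup you already have.

```python
from pathlib import Path


def read_mismatch_counts(sys_block="/sys/block"):
    """Return {array_name: mismatch_cnt} for every md array found under sys_block."""
    counts = {}
    # Each md array has a directory like /sys/block/md0/md/ holding mismatch_cnt.
    for cnt_file in sorted(Path(sys_block).glob("md*/md/mismatch_cnt")):
        array = cnt_file.parent.parent.name  # e.g. "md0"
        counts[array] = int(cnt_file.read_text().strip())
    return counts


def format_report(counts):
    """Build one report line per array, flagging any non-zero counter."""
    lines = []
    for array, count in counts.items():
        status = "OK" if count == 0 else "MISMATCH"
        lines.append(f"{array}: mismatch_cnt={count} [{status}]")
    return "\n".join(lines)
```

Calling print(format_report(read_mismatch_counts())) from the job that runs after the periodic check would put the counters into whatever mail that job already sends.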

Each drive uses checksums (error-correcting codes) internally; that's how it detects read errors. If the drive gets the wrong data on a read, it won't return it; it'll report an error instead. And usually that's good enough.

Then there's a special kind of bitrot that no filesystem will help you with, either. It's when software writes corrupt data in the first place. Like the braindead photo manager that changes the EXIF data of every picture it finds. The files will be corrupted, but the checksumming anti-bitrot filesystem will happily tell you: yeah, that's the data I was told to write and the checksum checks out, what about it?

At that point, you need a backup system that checksums files and detects changed files and does not remove/replace intact data with new/corrupt versions of the same data, so you can go back to the intact ones. And it would be grand if it could send you a report about changed files and you would notice if your entire picture collection was in it even though you can't remember changing them.
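The "checksum the files and report what changed" part of such a backup system could look roughly like this. It is only a sketch under assumptions: SHA-256 via Python's hashlib is my choice of checksum, the names build_manifest and diff_manifests are hypothetical, and keeping the old file versions around is left to the backup tool itself.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as fh:
                # Read in 1 MiB chunks so large files don't fill memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def diff_manifests(old, new):
    """Compare two manifests and list changed, removed, and added files."""
    return {
        "changed": sorted(p for p in old if p in new and old[p] != new[p]),
        "removed": sorted(p for p in old if p not in new),
        "added": sorted(p for p in new if p not in old),
    }
```

Run build_manifest before each backup, compare it with the manifest from the previous run, and mail the "changed" list. If your entire picture collection shows up there and you don't remember editing it, that is the warning sign.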

Heck, some things should be mounted read-only by default... but no one does it because it's a hassle.
