Lum – ZFS silent corruption

data, illumos, solaris, zfs

I am trying to find answers to some questions on how ZFS works:

  • Does it detect silent corruption via checksums as soon as the data changes (and no longer matches its checksum), automatically in a way (and then, with RAIDZ1, repair it from the redundant data), OR does this work only when the corrupted file is accessed (during a read, and during scrubbing of course)?
  • I am also confused about traditional hardware RAID: can it detect silent corruption with the same certainty as ZFS, and locate the corruption as well? If yes, is it also able to repair it as ZFS does?

I just need a more precise explanation of how this works.

Thanks.

Best Answer

Checksum verification happens on reads, not at the moment the data on disk changes. To have everything read and verified (except free space), you can scrub regularly. For software RAID (mdadm), you can run --action=check and then see whether mismatch_cnt is still 0.
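As a rough illustration of the mdadm side, the mismatch counter can be read from sysfs once a check has finished. This is only a sketch: the array name md0 and the helper name read_mismatch_cnt are assumptions made for the example, and the path follows the usual Linux md layout under /sys/block.

```python
from pathlib import Path


def read_mismatch_cnt(md_name="md0", sysfs_root="/sys/block"):
    """Return the mismatch count the md driver exposes for one array.

    md_name and sysfs_root are illustrative defaults; adjust them to
    match the array you actually checked.
    """
    path = Path(sysfs_root) / md_name / "md" / "mismatch_cnt"
    # The file holds a single decimal number followed by a newline.
    return int(path.read_text().strip())
```

On a machine with an array named md0, calling read_mismatch_cnt() after the check completes should return 0 if data and parity agreed everywhere. A non-zero value only says that something disagreed, not which side is wrong.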

RAID only attempts to fix reported read errors (by rewriting the data). For mismatched data, you have to determine manually whether the mismatch is relevant (whether it lies in free space or not) and whether the data or the parity is the correct one.

Essentially, with RAID you trust the storage not to misbehave and to report errors properly instead of silently returning false data. RAID has no checksums, nor does it verify parity on every read.
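To make the difference concrete, here is a toy model of checksum-on-read with repair from a second copy. It is a deliberately simplified sketch, not how ZFS is implemented: the ToyMirror class, its two in-memory "disks", and the use of SHA-256 are all assumptions chosen for illustration. The point it shows is that corruption goes unnoticed until a read or a scrub touches the block, and that a separately stored checksum tells you which copy is the good one.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    # SHA-256 is used here only for simplicity (an assumption of this sketch).
    return hashlib.sha256(data).digest()


class ToyMirror:
    """Two copies of every block, with checksums kept apart from the data."""

    def __init__(self):
        self.disks = [{}, {}]   # block_id -> bytes, one dict per "disk"
        self.checksums = {}     # block_id -> expected checksum

    def write(self, block_id, data):
        for disk in self.disks:
            disk[block_id] = data
        self.checksums[block_id] = checksum(data)

    def read(self, block_id):
        expected = self.checksums[block_id]
        for disk in self.disks:
            data = disk[block_id]
            if checksum(data) == expected:
                # Found a good copy: overwrite any copy that fails its checksum.
                for other in self.disks:
                    if checksum(other[block_id]) != expected:
                        other[block_id] = data
                return data
        raise IOError(f"block {block_id}: no copy matches its checksum")

    def scrub(self):
        # A scrub is just a read of every allocated block.
        for block_id in list(self.checksums):
            self.read(block_id)


pool = ToyMirror()
pool.write(0, b"hello")
pool.disks[0][0] = b"hellp"       # silent corruption: nothing notices yet
still_corrupt = pool.disks[0][0]  # remains b"hellp" until something reads it
healed = pool.read(0)             # the read verifies, picks the good copy, repairs
```

A plain mirror without checksums would see two differing copies here and have no way to tell which one is correct, which is the manual decision described above.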
