When you're using ext4, you can check for bad blocks with the command e2fsck -c /dev/sda1 (substitute your partition's device name; the file system must be unmounted while you run this). This will "blacklist" the blocks by adding them to the bad block inode.
e2fsck -c runs badblocks on the underlying hard disk. You can use the badblocks command directly on an LVM physical volume (assuming that the PV is in fact a hard disk, and not some other kind of virtual device, like an MD software RAID device), just as you would use it on a hard disk that contains an ext file system.
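For example, a read-only scan of a PV might look like this (a sketch; /dev/sdb is a placeholder for your PV's device; -s shows progress, -v reports each bad block found):

# Read-only scan of the whole device; reports any unreadable blocks.
badblocks -sv /dev/sdb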
That won't add any kind of bad block information to the file system, but I don't think that's a useful feature of the file system anyway; the hard disk itself is supposed to handle bad blocks.
Even better than badblocks is running a SMART self-test on the disk (replace /dev/sdX with the device name of your hard disk):

smartctl -t long /dev/sdX
smartctl -a /dev/sdX | less

The test itself will take a few hours (smartctl will tell you exactly how long). When it's done, you can query the result with smartctl -a; look for the self-test log. If it says "Completed without error", your hard disk is fine.
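If you just want the self-test log without the rest of the report, smartctl can print only that section (same /dev/sdX placeholder):

smartctl -l selftest /dev/sdX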
> In other words, how can I check for bad blocks to not use in LVM?
As I said, the hard disk itself will ensure that it doesn't use damaged blocks, and it will also relocate data away from those blocks; that's not something the file system or the LV has to do. On the other hand, when your hard disk has more than just a few bad blocks, you don't want something that relocates them; you want to replace the whole hard disk, because it is failing.
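You can check how far along that relocation process is from the SMART attribute table; the relevant attributes are usually called Reallocated_Sector_Ct and Current_Pending_Sector, though exact names vary by vendor (a sketch, same /dev/sdX placeholder):

smartctl -A /dev/sdX | grep -i -e reallocat -e pending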
It probably can, but that won't help you due to how flash media work.
A hard disk can rewrite individual sectors in place. A flash medium, in contrast, can write individual bits but can only erase a whole erase block at a time. The size of an erase block varies, but it's often something like 128 KiB. Since that's a lot to erase and rewrite if we only want to change one sector (the size unit hard disks and operating systems deal with), the thumb drive splits each erase block up into sector-sized units. When you change something, it marks the sector you've just changed as "no longer in use", and then writes the modified version somewhere else. After a while, it sees that an erase block has no active sectors left, and erases the block.
What this means is that if one sector is broken, the next time you write to that logical sector it will not be broken anymore, since it will now be backed by a different physical sector.
In addition, flash wears out after a number of write cycles, at which point it fails (the exact number depends on the quality of the flash chips, but is rarely less than something like 100,000). For this reason, as well as for the extra space needed for the erase-block bookkeeping, a thumb drive has some extra capacity that is not advertised; e.g., a 4 GB thumb drive might expose 4000 MB but have 4096 MB internally, or 4200 MB, or some such. When a particular erase block starts to fail after too many write/erase cycles, the thumb drive marks it as bad and no longer uses it. It can do this for a while, but eventually the spare space is used up; at that point, when it tries to copy a sector to make a requested change, it will not find an empty sector anymore and can only report a write error.
When your thumb drive reaches that point, as yours seems to have, it's time to replace it; it won't be long now before you'll start losing data (if that hasn't already happened).
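If you want to confirm the diagnosis before discarding the stick, a write-mode badblocks run will exercise every sector. Note that this destroys all data on the device (a sketch; /dev/sdX is a placeholder for the thumb drive):

# WARNING: -w overwrites the whole device with test patterns.
badblocks -wsv /dev/sdX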
Best Answer
Sadly, no.
btrfs doesn't track bad blocks, and btrfs scrub doesn't prevent the next file from hitting the same bad block(s). This btrfs mailing list post suggests using ext4 with mkfs.ext4 -c (this "builds a bad blocks list and then won't use those sectors"). The suggestion to use btrfs over mdadm 3.1+ with RAID0 will not work. It seems that LVM doesn't support bad block reallocation.
A work-around is to build a device excluding blocks known to be bad: btrfs over dmsetup.
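A minimal sketch of that work-around, assuming you've found one bad region with badblocks (all numbers here are made up for illustration; dmsetup linear rows are logical_start length linear device offset, counted in 512-byte sectors):

# Map sectors 0-999 straight through, skip the 8 bad sectors
# 1000-1007, and continue at offset 1008. Numbers are hypothetical.
dmsetup create nobad <<'EOF'
0 1000 linear /dev/sdb1 0
1000 2000000 linear /dev/sdb1 1008
EOF
mkfs.btrfs /dev/mapper/nobad

The second row's length (2000000 here) has to match what's actually left of the device after the skipped region.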
The btrfs Project Ideas wiki says:
Please comment if the status changes and I will update this answer.