With Software RAID, you don't have to use whole disks.
If you have 3x2TB and 3x1TB disks, and you plan to replace the 1TB drives with 2TB ones in the future, you could use 1TB members. That gives you a RAID5 (or RAID6, if you prefer) over 6x1TB members, one from each disk, plus a RAID5 over 3x1TB members, the remaining halves of the 2TB disks. The 2TB disks are thus shared by both RAIDs.
When you kick out a 1TB disk and add a 2TB disk instead, one RAID will see a replacement member, and the other will have the newly available 1TB added as a new member.
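A rough sketch of this with mdadm, assuming the 2TB disks are sda, sdb and sdc, the 1TB disks are sdd, sde and sdf, and each disk carries 1TB partitions (all device and partition names here are only examples):
mdadm --create /dev/md0 --level=5 --raid-devices=6 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
mdadm --create /dev/md1 --level=5 --raid-devices=3 /dev/sda2 /dev/sdb2 /dev/sdc2
The first array takes one 1TB partition from every disk; the second uses the leftover second halves of the 2TB disks.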
When you're using ext4, you can check for bad blocks with the command e2fsck -c /dev/sda1
or whatever. This will "blacklist" the blocks by adding them to the bad block inode.
e2fsck -c runs badblocks on the underlying hard disk. You can use the badblocks command directly on an LVM physical volume (assuming that the PV is in fact a hard disk, and not some other kind of virtual device, like an MD software RAID device), just as you would use that command on a hard disk that contains an ext file system.
That won't add any kind of bad block information to the file system, but I don't really think that's a useful feature of the file system anyway; the hard disk is supposed to handle bad blocks.
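For example, a plain (read-only, non-destructive) scan of a disk used as a PV could look like this; the device name is only an example:
badblocks -sv /dev/sdb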
Even better than badblocks is running a SMART self-test on the disk (replace /dev/sdX with the device name of your hard disk):
smartctl -t long /dev/sdX
smartctl -a /dev/sdX | less
The test itself will take a few hours (it will tell you exactly how long). When it's done, you can query the result with smartctl -a and look for the self-test log. If it says "Completed without error", your hard disk is fine.
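If you only want the self-test log rather than the full report, this should work too:
smartctl -l selftest /dev/sdX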
In other words, how can I check for bad blocks so that they are not used in LVM?
As I said, the hard disk itself will ensure that it doesn't use damaged blocks, and it will also relocate data away from those blocks; that's not something the file system or LVM has to do. On the other hand, once your hard disk has more than just a few bad blocks, you don't want something that relocates them; you want to replace the whole hard disk, because it is failing.
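To get a rough idea of how many sectors the disk has already relocated (or is still waiting to relocate), you can look at the SMART attributes; the exact attribute names vary between vendors, but something along these lines usually works:
smartctl -A /dev/sdX | grep -i -e reallocated -e pending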
Best Answer
In general, as has been mentioned in a comment here and in the mailing list thread you linked to, modern hard drives which are so far gone they’ve got unreplaceable bad blocks should just be discarded. (You’ve explained why you’re interested in this, but it’s worth noting for other readers.)
I don’t think there’s anything in LVM to avoid bad blocks as such; typically you’d address that below LVM, at the device layer. One way of dealing with the problem is to use device mapper: create a table giving the sector mapping required to skip all the bad blocks, and build a device using that. Such a table would look something like this (each line of a device-mapper "linear" table reads: start length linear device offset):
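0 98 linear /dev/sda 0
98 98 linear /dev/sda 99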
etc. (this creates a 196-sector device, using /dev/sda but skipping sector 98). You give this table to dmsetup:
and then create a PV on the resulting /dev/nobbsda device (instead of /dev/sda).
Using this method, with a little forward-planning, you can even handle failing sectors in the future, in the same way as a drive’s firmware does: leave some sectors at the end of the drive free (or even dotted around the drive, if you want to spread the risk), and then use them to fill holes left by failing sectors. Using the above example, if we consider sectors starting from, say, 200 to be spare sectors, and sector 57 becomes bad:
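0 57 linear /dev/sda 0
57 1 linear /dev/sda 200
58 40 linear /dev/sda 58
98 98 linear /dev/sda 99
Here logical sector 57 is redirected to the spare physical sector 200, while all other sectors keep their original mapping, so the device still presents 196 sectors.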
Creating a device-mapper table using a list of bad sectors as given by badblocks is left as an exercise for the reader.
Another solution that would work with an existing LVM setup would be to use pvmove’s ability to move physical extents in order to move LVs out of bad areas. But that wouldn’t prevent those areas from being re-used whenever a new LV is created or an existing LV is resized or moved.
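For example, to push any LVs off a suspect range of physical extents on the PV (the device name and extent numbers are purely illustrative):
pvmove /dev/sda:1000-1999
Without a destination PV, pvmove places the moved extents according to the volume group’s normal allocation rules.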