It sounds like the hard disk itself is having problems ("short read," etc.). If so, dmesg | tail will probably show some I/O errors.
Another way to check this is to run badblocks -n on the problem partition, or better, on the entire disk. Whatever you test must be unmounted first, and the scan will take hours on a large modern disk. If any partition that still mounts holds data you can't live without, copy it off onto removable media or a network volume before you start.
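As a sketch of that badblocks pass (the scratch image just makes the example safe to copy-paste; on real hardware you would point badblocks straight at the unmounted device, e.g. /dev/sdb, which is a placeholder name here):

```shell
# Safe stand-in: exercise badblocks against a scratch image rather than a
# real disk. For the real thing, replace scratch.img with the device node
# (unmounted!) -- the flags are the same.
dd if=/dev/zero of=scratch.img bs=1M count=8 status=none

# -n: non-destructive read-write test (reads each block, writes a test
#     pattern, verifies it, then restores the original contents)
# -s: show progress    -o: write any bad blocks found to a file
badblocks -n -s -o badlist.txt scratch.img

# An empty badlist.txt means every block survived the write/read/verify cycle.
wc -c < badlist.txt
```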
The suggestion to mirror the disk is also good. It's kind of a "lite" version of the badblocks -n check: by forcing the disk to read every sector, it can cause the drive to relocate problem blocks, just as badblocks -n will. badblocks -n is more effective, because dodgy sectors can be barely readable and may only reveal themselves to the disk as bad enough to relocate when you attempt to write to them. Still, if the disk has enough life left in it to survive a rescue, the extra read pass won't be enough to finish it off.
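A minimal sketch of such a forced full read pass with plain dd (run here against a scratch image; on the real disk, if= would be the device node, and conv=noerror keeps dd going past unreadable sectors):

```shell
# Stand-in image; for the real disk, if= would be e.g. /dev/sdb.
dd if=/dev/zero of=scratch.img bs=1M count=4 status=none

# Read every block once, continuing past read errors. On a real drive,
# this full pass is what prompts the firmware to remap marginal sectors.
dd if=scratch.img of=/dev/null bs=1M conv=noerror status=none && echo "read pass done"
```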
I don't hold much hope that running fsck on the disk image will recover everything. You'll almost certainly lose sectors in the process, which means some files will be unreadable or corrupted beyond use. A JPEG will still partially decode despite corrupted data, for example, but a JPEG with the bottom ⅔ cropped off might not be of much use to you.
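For what it's worth, fsck runs fine directly against an image file, so you can attempt the repair on a copy and keep the original rescue untouched. A sketch, using a freshly made ext4 image as a stand-in for the rescued one:

```shell
# Stand-in for an image rescued off the dying disk with dd.
dd if=/dev/zero of=rescued.img bs=1M count=16 status=none
mkfs.ext4 -q -F rescued.img

# -f: force a full check even if the fs looks clean; -y: auto-answer yes.
# Work on a copy -- e2fsck rewrites the image in place.
e2fsck -f -y rescued.img
```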
Is my data toasted?
Possibly, possibly not. The badblocks -n pass can sometimes fix the problem. Even if it does, you still need to replace the HDD, since a disk only gets into such a bad state by being nearly dead to start with.
Did I do the wrong thing already?
Other than forgetting the meaning of the word "rigorous," no. :)
Sadly, no.
btrfs doesn't track bad blocks, and btrfs scrub doesn't prevent the next file from hitting the same bad block(s).
This btrfs mailing list post suggests using ext4 with mkfs.ext4 -c (this "builds a bad blocks list and then won't use those sectors").
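A sketch of that ext4 route (demonstrated on a scratch image; on the real partition you would run it against the device node, and -cc does a slower read-write scan instead of the default read-only one):

```shell
dd if=/dev/zero of=fs.img bs=1M count=16 status=none

# -c runs a badblocks scan first and records any hits in the filesystem's
# bad-block inode, so those sectors are never handed out to files.
mkfs.ext4 -q -F -c fs.img

# dumpe2fs -b prints the recorded bad-block list (empty for a healthy image).
dumpe2fs -b fs.img
```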
The suggestion to use btrfs over mdadm 3.1+ with RAID0 will not work.
It seems that LVM doesn't support badblock reallocation.
A work-around is to build a device excluding blocks known to be bad: btrfs over dmsetup.
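As an illustration of that work-around (all device names and sector numbers below are made up; in practice you would derive them from badblocks output), the device-mapper table maps the good ranges with linear targets and plugs the bad run with an error target:

```
# Illustrative dm table -- start/length are in 512-byte sectors
# <start> <length> <target> <args>
0        1000     linear   /dev/sdb1 0       # good run before the bad spot
1000     8        error                      # 8 bad sectors, never touched
1008     199992   linear   /dev/sdb1 1008    # remainder of the partition
```

Loaded with dmsetup create (which reads the table from stdin), the resulting /dev/mapper device can then be formatted with mkfs.btrfs, and btrfs never sees the bad sectors.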
The btrfs Project Ideas wiki says:
Not claimed — no patches yet — Not in kernel yet
Currently btrfs doesn't keep track of bad blocks, disk blocks that are very likely to lose data written to them. Btrfs should accept a list in badblocks' output format, store it in a new btree (or maybe in the current extent tree, with a new flag), relocate whatever data the blocks contain, and reserve these blocks so they can't be used for future allocations. Additionally, scrub could be taught to test for bad blocks when a checksum error is found. This would make scrub much more useful; checksum errors are generally caused by the disk, but while scrub detects afflicted files, which in a backup scenario gives the opportunity to recreate them, the next file to reuse the bad blocks will just start getting errors instead. These two items would match an ext4 feature (used through e2fsck).
Please comment if the status changes and I will update this answer.