Linux – Why is fsck saying corrupt superblock or partition table and how to fix

data-recovery filesystem-corruption filesystems linux

I made an image (dd) of a drive and I am trying to run a filesystem check on it:
Filesystem Type: ext3

Here is the Original error from fsck:

 fsck -fv -z ./Seagate.ST3500320NS.SN-9QM5ZHHR.500GB.465GiB.undo.$(date +"%Y-%m-%d.%H.%M.%S").und /dev/loop2
fsck from util-linux 2.29.2
e2fsck 1.43.4 (31-Jan-2017)
Overwriting existing filesystem; this can be undone using the command:
    e2undo ./Seagate.ST3500320NS.SN-9QM5ZHHR.500GB.465GiB.undo.2019-01-17.13.31.41.und /dev/loop2

The filesystem size (according to the superblock) is 122063840 blocks
The physical size of the device is 121604515 blocks
Either the superblock or the partition table is likely to be corrupt!

Info from fdisk -l /dev/sda

Disk /dev/sda: 465.8 GiB, 500107862016 bytes, 976773168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x000794ac

Device     Boot  Start       End   Sectors   Size Id Type
/dev/sda1  *        32      8191      8160     4M  4 FAT16 <32M
/dev/sda2       262144 976773119 976510976 465.7G 83 Linux

Info from fdisk -l ./Seagate.ST3500320NS.SN-9QM5ZHHR.500GB.465GiB.img

Disk ./Seagate.ST3500320NS.SN-9QM5ZHHR.500GB.465GiB.img: 464 GiB, 498226311168 bytes, 973098264 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x000794ac

Device                                              Boot  Start       End   Sectors   Size Id Type
./Seagate.ST3500320NS.SN-9QM5ZHHR.500GB.465GiB.img1 *        32      8191      8160     4M  4 FAT16 <32M
./Seagate.ST3500320NS.SN-9QM5ZHHR.500GB.465GiB.img2      262144 976773119 976510976 465.7G 83 Linux

I created a loop device for the second partition using:

losetup --offset $((512*262144)) /dev/loop2 ./Seagate.ST3500320NS.SN-9QM5ZHHR.500GB.465GiB.img

From blockdev --getbsz /dev/loop2:

4096

From blockdev --getsz /dev/loop2:

972836120

From dumpe2fs /dev/loop2:

Filesystem UUID:          f68ccb5a-bcfa-4e8a-8876-45adaa6e6b85
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype sparse_super large_file
Filesystem flags:         signed_directory_hash
Default mount options:    user_xattr acl
Filesystem state:         clean with errors
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              30523392
Block count:              122063840
Reserved block count:     6103192
Free blocks:              96939245
Free inodes:              30462657
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      994
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Filesystem created:       Sat Apr 26 21:28:22 2014
Last mount time:          Wed Jan 16 15:59:22 2019
Last write time:          Thu Jan 17 18:16:50 2019
Mount count:              17
Maximum mount count:      -1
Last checked:             Sat Apr 26 21:28:22 2014
Check interval:           0 (<none>)
Lifetime writes:          10 MB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      162d0daa-7968-48f9-8370-f095c9e19f58
Journal backup:           inode blocks
Journal features:         journal_incompat_revoke
Journal size:             128M
Journal length:           32768
Journal sequence:         0x000059bd
Journal start:            0

Followed by a lot of:

Group 0: (Blocks 0-32767)
  Primary superblock at 0, Group descriptors at 1-30
  Reserved GDT blocks at 31-1024
  Block bitmap at 1025 (+1025)
  Inode bitmap at 1026 (+1026)
  Inode table at 1027-1538 (+1027)
  4 free blocks, 8179 free inodes, 2 directories
...
(SKIPPING TO END)
...
Group 3725: (Blocks 122060800-122063839)
  Block bitmap at 122060800 (+0)
  Inode bitmap at 122060801 (+1)
  Inode table at 122060802-122061313 (+2)
  0 free blocks, 8192 free inodes, 0 directories

Finished with:

dumpe2fs: /dev/loop2: error reading bitmaps: Can't read a block bitmap

I can mount /dev/sda2 just fine and read the files.
However, I cannot mount /dev/loop2:

mount -t ext3 /dev/loop2 ./DriveImage/
mount: wrong fs type, bad option, bad superblock on /dev/loop2,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

I get the same error trying to mount directly from the image using:

mount -o loop,offset=$((512*262144)) ./Seagate.ST3500320NS.SN-9QM5ZHHR.500GB.465GiB.img ./DriveImage

According to dumpe2fs, the superblock is right!
And according to the math:

 Superblock says (filesystem size in blocks):
 122063840
 fsck says (physical size of the device in blocks):
 121604515

 block size:
 4096

 Math: Sectors * Sector Size = Size / Block Size = Blocks
 Partition 1: 
 8160 * 512 = 4177920 / 4096 = 1020
 Partition 2:
 [From fdisk]
 976510976 * 512 = 499973619712 / 4096 = 122063872
 [From blockdev with /dev/loop2]
 972836120 * 512 = 498092093440 / 4096 = 121604515

fdisk reports very nearly the right number of blocks (just 32 more than the superblock).
As far as I can tell, fsck gets the device size the same way blockdev does (or uses blockdev itself), but according to dumpe2fs and the actual partition table, both the superblock and the partition table are correct.
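The arithmetic above can be re-checked with a few lines of shell. This is only a sketch: the function and variable names are made up for illustration, and the numbers are the ones reported by fdisk, blockdev and dumpe2fs above.

```shell
#!/bin/sh
# Re-check the block counts from the question using shell arithmetic.
SECTOR=512     # bytes per sector (from fdisk)
BLOCK=4096     # filesystem block size (from dumpe2fs)

# Convert a count of 512-byte sectors into 4096-byte filesystem blocks.
blocks_from_sectors() {
    echo $(( $1 * SECTOR / BLOCK ))
}

sb_blocks=122063840                               # "Block count" from dumpe2fs
part_blocks=$(blocks_from_sectors 976510976)      # partition 2 size from fdisk
loop_blocks=$(blocks_from_sectors 972836120)      # blockdev --getsz /dev/loop2

slack=$(( part_blocks - sb_blocks ))              # unused blocks at end of partition
shortfall=$(( sb_blocks - loop_blocks ))          # blocks the loop device lacks

echo "partition table: $part_blocks blocks"
echo "loop device:     $loop_blocks blocks"
echo "superblock:      $sb_blocks blocks"
echo "partition is $slack blocks larger than the filesystem"
echo "loop device is $shortfall blocks smaller than the filesystem"
```

The partition comes out 32 blocks larger than the filesystem, which is harmless, while the loop device is 459325 blocks smaller, which is what e2fsck is complaining about.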

For fear of losing the original data on the disk (10 years' worth of family pictures/videos and important files), I am not willing to run any of this on the original disk. So I copied the disk to this image and, in case I screw something up, also made a copy of the image. (Don't worry, I have the disk space for this.)

What am I doing wrong here, and how can I fix this?

NOTE: I am doing this because the old drive seems to be starting to fail (an assumption; I have had some issues with it).

ALSO, for some reason the drive lost its partition table and I had to use testdisk to recover it. After it was recovered, I was able to mount the large partition and read all my data.
So I assumed that testdisk got it either right or pretty darn close, since it is all there.

(UPDATE #1) I should also note that when I run fsck on the original drive, I do not get this error:

fsck -nfv /dev/sda2
fsck from util-linux 2.29.2
e2fsck 1.43.4 (31-Jan-2017)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

       60735 inodes used (0.20%, out of 30523392)
        1510 non-contiguous files (2.5%)
          49 non-contiguous directories (0.1%)
             # of inodes with ind/dind/tind blocks: 23779/2425/0
    25124595 blocks used (20.58%, out of 122063840)
           0 bad blocks
           1 large file

       56022 regular files
        4704 directories
           0 character device files
           0 block device files
           0 fifos
           0 links
           0 symbolic links (0 fast symbolic links)
           0 sockets
------------
       60726 files

(UPDATE #2) I found out that the image file is not the same size as the drive; the image file is smaller.
Drive Size: 500107862016 bytes (I had a mistake here: I originally gave only the size of the second partition. Now corrected.)
Image Size: 498226311168 bytes
The image file is missing 1881550848 bytes, over 1.88 GB of data. (This is corrected too.)
It seems that dd did not get everything, and that I might be right that the drive has problems. Is there a way to make dd fill read errors with blank space so that I end up with an image of matching size?
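A quick shell check ties the shorter image to the size of the loop device. It is a sketch with illustrative variable names, using only the byte and sector counts quoted above.

```shell
#!/bin/sh
# Compare the drive and image sizes, then work out how much of
# partition 2 actually fits inside the truncated image.
sector=512
drive_bytes=500107862016      # from fdisk -l /dev/sda
image_bytes=498226311168      # from fdisk -l on the image file
part_start=262144             # start sector of partition 2
part_sectors=976510976        # sectors in partition 2 per the partition table

missing_bytes=$(( drive_bytes - image_bytes ))
image_sectors=$(( image_bytes / sector ))

# Sectors between the start of partition 2 and the end of the image file.
avail_sectors=$(( image_sectors - part_start ))
short_sectors=$(( part_sectors - avail_sectors ))

echo "image is missing $missing_bytes bytes"
echo "partition 2 needs $part_sectors sectors"
echo "image provides   $avail_sectors sectors after the offset"
echo "partition 2 is cut short by $short_sectors sectors"
```

The available sector count works out to 972836120, the same figure blockdev --getsz reported for /dev/loop2, so the loop device simply ends where the image file ends.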

I am running fsck on the loop device to see what it does; I guess if I mess it up, I will just restore the backup image.

ANOTHER IMPORTANT NOTE: This is a headless server system, no GUI, just CLI.

Best Answer

Errors from a bad disk image

It sounds like you're seeing errors in the image caused by a failing drive. I'd use GNU ddrescue (packaged as gddrescue on Debian-based systems; the command is ddrescue) instead of dd, since it is designed to handle read errors.

The ddrescue manual is informative; its section 10, "A small tutorial with examples", starts with:

Example 1: Fully automatic rescue of a whole disc with two ext2 partitions in /dev/sda to /dev/sdb.
Note: you don't need to partition /dev/sdb beforehand, but if the partition table on /dev/sda is damaged, you'll need to recreate it somehow on /dev/sdb.

ddrescue -f -r3 /dev/sda /dev/sdb mapfile
fdisk /dev/sdb
e2fsck -v -f /dev/sdb1
e2fsck -v -f /dev/sdb2

Rescuing to an image file instead of directly to a device (/dev/sdb) also works. And instead of starting with -r3 to retry bad sectors 3 times, consider using the default (0 retries) together with -n / --no-scrape ("Skip the scraping phase") first, to get as much data as you can quickly.
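As a rough sketch of that two-pass approach, something like the following could be used. The function name and the image and mapfile paths are hypothetical; only the ddrescue options already mentioned above (-n and -r3) are used, and both passes share one mapfile so the second resumes where the first left off.

```shell
#!/bin/sh
# Two-pass rescue of a whole disk into an image file.
# Arguments (all hypothetical names): source device, image file, mapfile.
rescue_to_image() {
    src=$1
    img=$2
    map=$3

    # Pass 1: copy everything that reads cleanly and skip the slow
    # scraping phase (-n), so the good data is secured first.
    ddrescue -n "$src" "$img" "$map" || return 1

    # Pass 2: rerun with the same mapfile; only the areas still marked
    # as bad are tried again, with up to 3 retry passes (-r3).
    ddrescue -r3 "$src" "$img" "$map"
}

# Example call, run as root (the paths are placeholders):
#   rescue_to_image /dev/sda ./disk.img ./disk.map
```

Keeping the mapfile next to the image means an interrupted run can be restarted with the same command without re-reading areas that were already copied.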

There's also a ddrescueview package that gives a graphical view of a ddrescue mapfile, which might be interesting.

Monitoring syslog or dmesg should have shown read errors earlier; I'd keep an eye on them while using that drive.

Are there that many important files to back up?

If the files are still readable, and especially if the important files you want to back up are much smaller than the entire drive, just copy those files and forget about imaging the entire drive. Operating systems are easy to reinstall.

Mounting full disk images

It looks like losetup -P creates the appropriate per-partition loop devices by itself; partprobe, gnome-disk-image-mounter, or kpartx may also do the job.
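A minimal sketch of the losetup -P route, assuming a complete (not truncated) whole-disk image and root privileges. The function name, image path, and mount point are hypothetical; the partition device name follows the loopNpM pattern the kernel uses for scanned loop partitions.

```shell
#!/bin/sh
# Attach a whole-disk image with partition scanning and mount one of
# its partitions read-only.
# Arguments (hypothetical names): image file, partition number, mount point.
mount_image_partition() {
    img=$1
    partnum=$2
    mnt=$3

    # -f: use the first free loop device, --show: print its name,
    # -P: scan the partition table, -r: attach the image read-only.
    loopdev=$(losetup -f --show -P -r "$img") || return 1
    echo "attached $img as $loopdev (detach with: losetup -d $loopdev)" >&2

    # Partitions appear as <loopdev>p<N>, e.g. /dev/loop0p2.
    mount -o ro "${loopdev}p${partnum}" "$mnt"
}

# Example call, run as root (the paths are placeholders):
#   mount_image_partition ./disk.img 2 ./DriveImage
```

This avoids computing the byte offset by hand, and attaching the image read-only keeps the copy untouched while inspecting it.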
