Data Recovery – How to Find Lost Files After a ddrescue Attempt

data-recovery ddrescue hard-disk

I am in the process of salvaging data from a 1 TB failing drive (asked about it in Procedure to replace a hard disk?). I have done ddrescue from a system rescue USB with a resulting error size of 557568 B in 191 errors, probably all in /home (I assume what it calls "errors" are not bad sectors, but consecutive sequences of them).

Now, several guides I've seen around suggest doing e2fsck on the new disk, and I expected this to somehow find that some files have been assigned "blank sectors/blocks", to the effect of at least knowing which files could not be saved whole. But no errors were found at all (I ran it without -y to make sure I didn't miss anything). Now I am running it again with -c, but at 95% no errors have been found so far; I guess I have a new drive with some normal-looking files with zeroed or random pieces inside, undetectable until one day I open them with the corresponding software, or Linux Mint needs them.

Can I do anything with the old/new drives in order to obtain a list of possibly corrupted files? I don't know how many there could be, since those 191 errors could span several files, but at least the total size is not big; I am mostly concerned about a big bunch of old family photos and videos (1+ MB each), the rest is probably irrelevant or was backed up recently.

Update: the new pass of e2fsck did give something new of which I understand nothing:

Block bitmap differences:  +231216947 +(231216964--231216965) +231216970 +231217707 +231217852 +(231217870--231217871) +231218486
Fix<y>? yes
Free blocks count wrong for group #7056 (497, counted=488).                    
Fix<y>? yes
Free blocks count wrong (44259598, counted=44259589).
Fix<y>? yes

Best Answer

You'll need the block numbers of all encountered bad blocks (ddrescue should have given you a list in its log file; I hope you saved it), and then you'll need to find out which files make use of these blocks (e.g. with debugfs). You may want to script this if there are a lot of bad blocks.
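As a starting point for such a script, here is a minimal sketch that pulls the unrecovered ranges out of a ddrescue log. It assumes the usual layout (comment lines starting with #, then one current-position line, then "pos size status" lines) and treats every status other than + as not rescued. The function name bad_ranges is made up for this example.

```shell
# Print "start_byte size_bytes" (in decimal) for every range in a ddrescue
# log/mapfile that was NOT rescued. Assumption: status '+' means rescued,
# anything else ('?', '*', '/', '-') means not read successfully.
bad_ranges() {
    map=$1
    seen_current_pos=0
    while read -r pos size status _; do
        case $pos in
            '#'*|'') continue ;;            # skip comments and blank lines
        esac
        # The first non-comment line is the current position, not a range.
        if [ "$seen_current_pos" -eq 0 ]; then
            seen_current_pos=1
            continue
        fi
        [ "$status" = "+" ] && continue     # rescued, nothing to report
        # printf converts the 0x-prefixed hex values to decimal.
        printf '%d %d\n' "$pos" "$size"
    done < "$map"
}
```

Run it as `bad_ranges fs.log`. For a log in which every range is marked +, it prints nothing.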

e2fsck doesn't help here: it only checks the consistency of the file system itself, so it will only act if the bad blocks contain "administrative" file system information.

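Bad blocks inside files are not flagged in any way: ddrescue simply leaves those areas of the copy unwritten, so they hold whatever the new disk contained before (typically zeros).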

Edit

Ok, let's figure out the block size thingy. Let's make a trial filesystem with 512-byte device blocks:

$ dd if=/dev/zero of=fs bs=512 count=200
$ /sbin/mke2fs fs

$ ll fs
-rw-r--r-- 1 dirk dirk 102400 Apr 27 10:03 fs

$ /sbin/tune2fs -l fs
...
Block count:              100
...
Block size:               1024
Fragment size:            1024
Blocks per group:         8192
Fragments per group:      8192

So the filesystem block size is 1024, and we've 100 of those filesystem blocks (and 200 512-byte device blocks). Rescue it:

$ ddrescue -b512 fs fs.new fs.log
GNU ddrescue 1.19
Press Ctrl-C to interrupt
rescued:    102400 B,  errsize:       0 B,  current rate:     102 kB/s
   ipos:     65536 B,   errors:       0,    average rate:     102 kB/s
   opos:     65536 B, run time:       1 s,  successful read:       0 s ago
Finished                                     

$ cat fs.log
# Rescue Logfile. Created by GNU ddrescue version 1.19
# Command line: ddrescue fs fs.new fs.log
# Start time:   2017-04-27 10:04:03
# Current time: 2017-04-27 10:04:03
# Finished
# current_pos  current_status
0x00010000     +
#      pos        size  status
0x00000000  0x00019000  +

$ printf "%i\n" 0x00019000
102400

So the hex values in the ddrescue log are byte counts, not block counts. Finally, let's see what debugfs uses. First, make a file and find its contents:

$ sudo mount -o loop fs /mnt/tmp
$ sudo chmod go+rwx /mnt/tmp/
$ echo 'abcdefghijk' > /mnt/tmp/foo
$ sudo umount /mnt/tmp

$ hexdump -C fs
...
00005400  61 62 63 64 65 66 67 68  69 6a 6b 0a 00 00 00 00  |abcdefghijk.....|
00005410  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

So the byte address of the data is 0x5400. Convert this to 1024-byte filesystem blocks:

$ printf "%i\n" 0x5400
21504
$ expr 21504 / 1024
21

and let's also try the block range while we are at it:

$ /sbin/debugfs fs
debugfs 1.43.3 (04-Sep-2016)
debugfs:  testb 0
testb: Invalid block number 0
debugfs:  testb 1
Block 1 marked in use
debugfs:  testb 99
Block 99 not in use
debugfs:  testb 100
Illegal block number passed to ext2fs_test_block_bitmap #100 for block bitmap for fs
Block 100 not in use
debugfs:  testb 21
Block 21 marked in use
debugfs:  icheck 21
Block   Inode number
21      12
debugfs:  ncheck 12
Inode   Pathname
12      //foo
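For more than a handful of blocks, the same lookup can be done without the interactive prompt, since debugfs runs a single request given with -R, and both icheck and ncheck accept several numbers at once. The following is only a sketch: the function names icheck_inodes and blocks_to_paths are made up here, and the parsing assumes the two-column icheck output shown above.

```shell
# Read `debugfs icheck` output on stdin and print the distinct inode numbers.
# The header line and "<block not found>" rows are skipped because their
# second column is not a plain number.
icheck_inodes() {
    awk 'NR > 1 && $2 ~ /^[0-9]+$/ { print $2 }' | sort -un
}

# blocks_to_paths IMAGE_OR_DEVICE BLOCK...
# Map filesystem block numbers to path names. debugfs opens the filesystem
# read-only unless -w is given; its version banner goes to stderr.
blocks_to_paths() {
    dev=$1
    shift
    inodes=$(debugfs -R "icheck $*" "$dev" 2>/dev/null | icheck_inodes)
    [ -n "$inodes" ] || return 0            # no block belongs to any file
    # Unquoted on purpose: joins the inode list onto one line for ncheck.
    debugfs -R "ncheck $(echo $inodes)" "$dev" 2>/dev/null
}
```

With the trial image from above, `blocks_to_paths fs 21` should report the path of foo, just like the interactive session.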

So that works out as expected, except block 0 is invalid, probably because the file system metadata is there. So, for your byte address 0x30F8A71000 (210330128384 in decimal) from ddrescue, assuming you worked on the whole disk and not a partition, we subtract the byte address of the partition start (start sector 7815168 times 512 bytes per sector):

210330128384 - 7815168 * 512 = 206328762368

Divide that by the tune2fs block size to get the filesystem block (note that one filesystem block spans several physical sectors, any of which may be the damaged one, so the division needn't come out exact; round down):

206328762368 / 4096 = 50373233.0

and that's the block you should test with debugfs.
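The arithmetic above can be wrapped in a small helper if there are many addresses to convert. This is just a sketch; fs_block is a made-up name, and the partition start sector, sector size and filesystem block size have to be supplied by you (from your partition table and tune2fs -l).

```shell
# fs_block BYTE_OFFSET PART_START_SECTOR SECTOR_SIZE FS_BLOCK_SIZE
# Convert a whole-disk byte offset (decimal or 0x-prefixed hex) into a
# filesystem block number inside the partition. Integer division rounds down.
fs_block() {
    offset=$(printf '%d' "$1")              # hex -> decimal if needed
    echo $(( ($offset - $2 * $3) / $4 ))
}
```

For the numbers used above, `fs_block 0x30F8A71000 7815168 512 4096` gives 50373233, which you can then pass to icheck in debugfs on the new partition.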
