Data Recovery – Find if ext4 Block is in Inode Table and Extract from Journal

data-recovery ddrescue ext4 journaling

So while I was moving from my old laptop to a new one, the old laptop's hard drive got some physical damage. badblocks reports 64 bad sectors. I had a two-month-old Ubuntu GNOME setup with separate / and /home partitions. From what I can tell, a few sectors in / were damaged, but that's not an issue. On the other hand, /home's partition gives me this annotated ddrescue log:

# Rescue Logfile. Created by GNU ddrescue version 1.17
# Command line: ddrescue -d -r -1 /dev/sdb2 home.img home.log
# current_pos  current_status
0x6788008400     -
#      pos        size  status
0x00000000  0x6788000000  +
0x6788000000  0x0000A000  -
    first 10 blocks of the ext4 journal
0x678800A000  0x2378016000  +
0x8B00020000  0x00001000  -
    inode table entries for /pietro (my $HOME) and a few folders within
0x8B00021000  0x00006000  +
0x8B00027000  0x00001000  -
    unknown (inode table?)
0x8B00028000  0x00004000  +
0x8B0002C000  0x00001000  -
    unknown (inode table?)
0x8B0002D000  0x001DC000  +
0x8B00209000  0x00001000  -
    unknown (inode table?)
0x8B0020A000  0x00090000  +
0x8B0029A000  0x00001000  -
    unknown (inode table?)
0x8B0029B000  0x4420E65000  +

I made the annotations using debugfs's icheck and testb commands; all the damaged blocks are marked as in use. Some superblock stats:

Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      972
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
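
For anyone who wants to reproduce the byte-offset-to-block step, here is a minimal sketch of how the bad regions in a log like the one above can be turned into filesystem block numbers for debugfs's icheck and testb. The function names are my own, and the parser simply skips anything that is not a "pos size status" line, so the hand-written annotations do no harm.

```python
BLOCK_SIZE = 4096  # "Block size" from the superblock stats

def bad_regions(log_text):
    """Return (byte_pos, byte_size) for every region ddrescue marked '-'."""
    regions = []
    for line in log_text.splitlines():
        fields = line.split()
        # Data lines have exactly three fields: pos, size, status.
        # Comments, the current_pos line and annotations do not match.
        if (len(fields) == 3 and fields[0].startswith("0x")
                and fields[1].startswith("0x") and fields[2] == "-"):
            regions.append((int(fields[0], 16), int(fields[1], 16)))
    return regions

def to_fs_blocks(pos, size, block_size=BLOCK_SIZE):
    """Return every filesystem block touched by the byte range."""
    first = pos // block_size
    last = (pos + size - 1) // block_size
    return list(range(first, last + 1))
```

Each block number this returns can then be passed to icheck or testb inside debugfs.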

So my questions are:

  1. Can I find out exactly what those four unknown blocks were, if not inode entries? My suspicion is that they are inode table entries, but icheck doesn't want to say. If they are, can I find out which inodes?
  2. Can I still recover these inode table entries from the journal by hand, even though the first 10 blocks of the journal are lost?

I'd rather not do this data recovery with fsck, which will just dump all my files in /lost+found in a giant mess of flattened directory structure and duplicate files…

Thanks.

Best Answer

All right, so for the first question, it turns out that debugfs's stats command reports the starting block of every section of each group. In addition, I guessed that inumbers had to be consecutive and increasing, so basic addition of the offset into the inode table, checked against the imap command, gave me the first inumber in each block. It also confirmed my suspicion about the last bad block: my block group calculation (block number divided by blocks per group) had put it in the wrong group, and it actually belongs to group 4449's inode table.

byte address  block      group  what                   first inumber
0x8B00020000  145752096  4448   inode table block 0    36438017
0x8B00027000  145752103  4448   inode table block 7    36438129
0x8B0002C000  145752108  4448   inode table block 12   36438209
0x8B00209000  145752585  4448   inode table block 489  36445841
0x8B0029A000  145752730  4449   inode table block 122  36448161

Since a block is 4096 bytes and each inode table entry is 256 bytes, there are 16 inodes per block. So I now have all 80 lost inode table entries by inumber.
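
The arithmetic behind that table can be sketched as follows. The constants come from the superblock stats and the 256-byte entry size mentioned above; the function name is made up, and table_start_block stands for whatever debugfs's stats reports as the start of that group's inode table (the values in the usage note are back-derived from the table, not copied from stats output).

```python
BLOCK_SIZE = 4096
INODE_SIZE = 256
INODES_PER_GROUP = 8192
INODE_BLOCKS_PER_GROUP = 512
INODES_PER_BLOCK = BLOCK_SIZE // INODE_SIZE  # 16

def lost_inodes(byte_addr, group, table_start_block):
    """Map a bad byte address to (block, inode table block index, inumbers)."""
    block = byte_addr // BLOCK_SIZE
    index = block - table_start_block
    if not 0 <= index < INODE_BLOCKS_PER_GROUP:
        raise ValueError("block is not inside this group's inode table")
    # inumbers start at 1, and every group owns INODES_PER_GROUP of them
    first = group * INODES_PER_GROUP + 1 + index * INODES_PER_BLOCK
    return block, index, range(first, first + INODES_PER_BLOCK)
```

For example, lost_inodes(0x8B0029A000, 4449, 145752608) gives inode table block 122 and a range starting at 36448161, matching the last row.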


Now let's turn to the journal. I wrote a small tool that dumps information about each block of the journal. Since the journal superblock was missing, two pieces of information that I needed for this were lost:

  • whether the journal held 64-bit block numbers
  • whether the journal used version 3 checksums

Fortunately, when I forced one (or both) of these switches on, the tag lists in some of the journal's descriptor blocks ran past the end of their blocks, proving that those flags were not set.
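
A rough sketch of that overflow test, for illustration only (this is not the tool I actually wrote). It assumes the usual jbd2 layout: a 12-byte big-endian block header, then tags of 8 bytes (plain), 12 bytes (64-bit) or 16 bytes (version 3 checksums), with a 16-byte UUID after any tag that lacks the SAME_UUID flag. It ignores checksum tails and the ESCAPE flag, and walk_tags is a made-up name.

```python
import struct

JBD2_MAGIC = 0xC03B3998
DESCRIPTOR_BLOCK = 1
FLAG_SAME_UUID = 0x2
FLAG_LAST_TAG = 0x8

def walk_tags(block, is_64bit=False, csum_v3=False):
    """Return (target_blocks, ok); ok is False when the tag list runs off
    the end of the block without ever reaching a LAST_TAG flag."""
    magic, blocktype, _sequence = struct.unpack_from(">III", block, 0)
    if magic != JBD2_MAGIC or blocktype != DESCRIPTOR_BLOCK:
        raise ValueError("not a descriptor block")
    tag_size = 16 if csum_v3 else (12 if is_64bit else 8)
    pos, found = 12, []
    while pos + tag_size <= len(block):
        blocknr = struct.unpack_from(">I", block, pos)[0]
        if csum_v3:
            flags = struct.unpack_from(">I", block, pos + 4)[0]
        else:
            flags = struct.unpack_from(">H", block, pos + 6)[0]
        if is_64bit:
            high = struct.unpack_from(">I", block, pos + 8)[0]
            blocknr |= high << 32
        found.append(blocknr)
        pos += tag_size
        if not flags & FLAG_SAME_UUID:
            pos += 16  # skip the UUID that follows this tag
        if flags & FLAG_LAST_TAG:
            return found, True
    return found, False
```

A wrong guess can still happen to parse cleanly on one block, so only an overrun counts as evidence; that is why it matters that some blocks overflowed rather than all of them.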

One awk script (fulllog.awk) later, I have a log of the form

0x0002A000 - descriptors
        0x0002B000 -> block 159383670
        0x0002C000 -> block 159383671
        0x0002D000 -> block 0
        0x0002E000 -> block 155189280
        0x0002F000 -> block 195559440
        0x00030000 -> block 47
        0x00031000 -> block 195559643
        0x00032000 -> block 195568036
        0x00033000 -> block 159383672
0x0002B000 - invalid/data block
0x0002C000 - invalid/data block
0x0002D000 - invalid/data block
0x0002E000 - invalid/data block
0x0002F000 - invalid/data block
0x00030000 - invalid/data block
0x00031000 - invalid/data block
0x00032000 - invalid/data block
0x00033000 - invalid/data block
0x00034000 - commit record
        commit time: 2014-12-25 16:53:13.703902604 -0500 EST

With this, another awk script (dumpallfor.awk) dumps all the blocks:

byte address  block      number of journaled blocks
0x8B00020000  145752096  6
0x8B00027000  145752103  10
0x8B0002C000  145752108  206
0x8B00209000  145752585  1
0x8B0029A000  145752730  0

So that last block is truly lost :( With any luck I can find out what files were there with debugfs's ncheck command.
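
The counting step can be sketched like this. It is a stand-in for dumpallfor.awk rather than a copy of it: it reads a log in the format shown above and collects, for each block of interest, the journal offsets that hold a copy of it. The function name and regular expression are my own.

```python
import re
from collections import defaultdict

# Matches the indented "0x0002B000 -> block 159383670" lines of the log.
TAG_LINE = re.compile(r"^\s+(0x[0-9A-Fa-f]+) -> block (\d+)\s*$")

def journaled_copies(log_text, wanted):
    """Map each wanted filesystem block to the journal offsets of its copies."""
    copies = defaultdict(list)
    for line in log_text.splitlines():
        match = TAG_LINE.match(line)
        if match and int(match.group(2)) in wanted:
            copies[int(match.group(2))].append(int(match.group(1), 16))
    # Blocks that never appear get an empty list, like the 0 in the table.
    return {block: copies.get(block, []) for block in wanted}
```

The length of each list corresponds to the "number of journaled blocks" column.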


So I have a bunch of blocks. And they all appear to differ! Now what?

I could go by the revocation records, but I can't seem to parse that structure meaningfully. I could go by the commit record timestamps, but before I try that, I want to see just how each inode table block differs. So I wrote another quick program (diff.go) to find that out.

For the most part, inode entries that do differ between copies differ only in timestamps, so we can just choose the copy with the latest timestamps. We'll do that later. For all other files, we get this:

36438023 - size differs
36438139 - OSD1 (file version high dword) differs
36438209 - OSD1 differs

Hm, that's not good... The file with differing size will be a problem, and I have no idea what to do about the two OSD1 files. I also tried using debugfs's ncheck to see what the files were, but we don't have a match.
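
To give an idea of what such a comparison involves, here is a small sketch in the spirit of diff.go (not the program itself). It compares two copies of the same inode table block field by field, using the standard little-endian ext4 inode offsets. The field list is deliberately partial and the names are my own labels, so a difference outside these fields would go unreported.

```python
import struct

INODE_SIZE = 256
# (label, offset within the inode, struct format)
FIELDS = [
    ("mode", 0, "<H"), ("size_lo", 4, "<I"), ("atime", 8, "<I"),
    ("ctime", 12, "<I"), ("mtime", 16, "<I"), ("dtime", 20, "<I"),
    ("links_count", 26, "<H"), ("blocks_lo", 28, "<I"),
    ("flags", 32, "<I"), ("osd1", 36, "<I"), ("size_high", 108, "<I"),
]

def diff_inode_blocks(a, b, first_inumber):
    """Return {inumber: [labels of fields that differ]} for two block copies."""
    result = {}
    for i in range(len(a) // INODE_SIZE):
        base = i * INODE_SIZE
        changed = [name for name, offset, fmt in FIELDS
                   if struct.unpack_from(fmt, a, base + offset)
                   != struct.unpack_from(fmt, b, base + offset)]
        if changed:
            result[first_inumber + i] = changed
    return result
```

Entries whose list contains only atime, ctime or mtime are the harmless timestamp-only case; anything else deserves a closer look.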

I then found out which block dumps have the latest timestamps for now (same repo, latest.go). The important thing to note is that I had the blocks scanned in chronological order by commit time. This is not necessarily the same as numerical order by block number; the journal is not always stored in chronologically increasing order.

As it turns out, however, the newest block (by commit time) is indeed the one with the latest timestamps!
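
The selection rule (newest by commit time, not by position in the journal) can be sketched as below. The input shape is an assumption for the sake of the example: a list of (commit_time, {filesystem block: journal offset}) pairs, one per committed transaction. Revoke records are ignored here, since I did not manage to parse them.

```python
def newest_copies(transactions):
    """Pick, for every filesystem block, the copy from the latest commit."""
    chosen = {}
    # Sort by commit time so later transactions overwrite earlier ones,
    # regardless of where they sit in the journal.
    for _commit_time, tags in sorted(transactions, key=lambda t: t[0]):
        for fs_block, journal_offset in tags.items():
            chosen[fs_block] = journal_offset
    return chosen
```

Because the journal wraps around, the copy at the highest journal offset is not guaranteed to be the one this returns.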


Let's try these latest blocks and see if we can recover anything from them.

sudo dd if=BLOCKFILE of=DDRESCUEIMG bs=1 seek=BYTEOFFSET conv=notrunc

After that my home directory is back!
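
The same in-place write can be done without dd's bs=1, which copies a byte at a time. This is a minimal equivalent sketch with a made-up function name; work on a copy of the image, since it overwrites data in place.

```python
def patch_image(image_path, block_path, byte_offset):
    """Overwrite part of the image with the contents of a dumped block file."""
    with open(block_path, "rb") as src:
        data = src.read()
    # "r+b" updates the file in place without truncating it,
    # which is what conv=notrunc does for dd.
    with open(image_path, "r+b") as image:
        image.seek(byte_offset)
        image.write(data)
    return len(data)
```

For the first damaged inode table block, byte_offset would be 0x8B00020000.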


Now let's find out what those three differing files were...

Inode   Pathname
36438023    /pietro/.cache/gdm/session.log
36438209    /pietro/.config/liferea
36438139    /pietro/.local/share/zeitgeist/fts.index

The only important thing there is Liferea's configuration directory, but I don't think that was corrupted; it was one of the OSD1-differing ones.

And let's find out about those 16 inodes in the final block, the one that we could not recover:

Inode   Pathname
36448176    /pietro/k2
36448175    /pietro/Downloads/sOMe4P7.jpg
36448174    /pietro/Downloads/picture.png
36448164    /pietro/Downloads/tumblr_nfjvg292T21s4pk45o1_1280.png
36448169    /pietro/Downloads/METROID Super Zeromission v.2.3+HARD_v2.4.zip
36448165    /pietro/Downloads/tumblr_mrfex1kuxa1sbx6kgo1_500.jpg
36448173    /pietro/Downloads/1*-vuzP4JAoPf9S6ZdHNR_Jg.jpeg
36448162    /pietro/.cache/upstart/gnome-settings-daemon.log.6.gz
36448163    /pietro/.cache/upstart/dbus.log.7.gz
36448171    /pietro/.cache/upstart/gnome-settings-daemon.log.3.gz
36448161    /pietro/.local/share/applications/Knytt Underground.desktop
36448166    /pietro/Documents/Screenshots/Screenshot from 2014-12-03 15:47:29.png
36448170    /pietro/Documents/Screenshots/Screenshot from 2014-12-03 16:51:26.png
36448172    /pietro/Documents/Screenshots/Screenshot from 2014-12-03 19:08:54.png
36448168    /pietro/Documents/transactions/premiere to operating transaction 4305747926.pdf
36448167    /pietro/Documents/transactions/transaction 4315883542.pdf

In short:

  • a text file with only one or two things in it, which I could get back by brute force since I know that it has a date stamp and something that's also in my chat logs
  • some images downloaded from the internet; if I can't get the URLs back from Firefox's history then I can use photorec
  • a ROM hack that I can easily get on the Internet again =P
  • log files; no loss here
  • the .desktop file for a Steam game
  • screenshots; I can get these back with photorec assuming gnome-screenshot added the datestamp as metadata
  • bank account transaction records; if I can't get them from the bank I could probably recover them with photorec

So not casualtyless but not a total loss, and I learned more about ext4 in the process. Thanks anyway!


UPDATE

Might as well put this out there:

NOT YET     /pietro/k2
FOUND       /pietro/Downloads/sOMe4P7.jpg
NOT YET     /pietro/Downloads/picture.png
FOUND       /pietro/Downloads/tumblr_nfjvg292T21s4pk45o1_1280.png
GOOGLEIT    /pietro/Downloads/METROID Super Zeromission v.2.3+HARD_v2.4.zip
FOUND       /pietro/Downloads/tumblr_mrfex1kuxa1sbx6kgo1_500.jpg
FOUND       /pietro/Downloads/1*-vuzP4JAoPf9S6ZdHNR_Jg.jpeg
UNNEEDED    /pietro/.cache/upstart/gnome-settings-daemon.log.6.gz
UNNEEDED    /pietro/.cache/upstart/dbus.log.7.gz
UNNEEDED    /pietro/.cache/upstart/gnome-settings-daemon.log.3.gz
UNNEEDED    /pietro/.local/share/applications/Knytt Underground.desktop
NOT YET     /pietro/Documents/Screenshots/Screenshot from 2014-12-03 15:47:29.png
NOT YET     /pietro/Documents/Screenshots/Screenshot from 2014-12-03 16:51:26.png
NOT YET     /pietro/Documents/Screenshots/Screenshot from 2014-12-03 19:08:54.png
NOT YET     /pietro/Documents/transactions/premiere to operating transaction 4305747926.pdf
NOT YET     /pietro/Documents/transactions/transaction 4315883542.pdf

And in case I'm not weird enough, the downloaded pictures were:

These were all shared by friends in chats.

I guess I'll keep this updated? (Not like it would make a difference...) I know I can recover everything; the only question is when =P
