All right, so for the first question it turns out the debugfs stats command reports the starting block of every section of a group. In addition, I guessed that inumbers had to be consecutive and increasing, so basic addition of the offset into the inode table, checked against the imap command, gave me the first inumbers. It also confirmed my suspicion about the last bad sector: my block group calculations had placed it in the wrong group.
byte address  block      group  what                   first inumber
0x8B00020000  145752096  4448   inode table block 0    36438017
0x8B00027000  145752103  4448   inode table block 7    36438129
0x8B0002C000  145752108  4448   inode table block 12   36438209
0x8B00209000  145752585  4448   inode table block 489  36445841
0x8B0029A000  145752730  4449   inode table block 122  36448161
Since a block is 4096 bytes and each inode table entry is 256 bytes, there are 16 inodes per block. So I now have all 80 lost inode table entries by inumber.
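The inumber arithmetic can be checked with a short sketch. It assumes 8192 inodes per group, which is not stated above but is consistent with the table (group 4448, table block 0: 4448 × 8192 + 1 = 36438017). The function names are made up for illustration.

```python
BLOCK_SIZE = 4096          # bytes per filesystem block (from the text)
INODE_SIZE = 256           # bytes per inode table entry (from the text)
INODES_PER_GROUP = 8192    # assumption: inferred from the table, not stated

INODES_PER_BLOCK = BLOCK_SIZE // INODE_SIZE  # 16


def first_inumber(group, table_block):
    """First inode number held in one block of a group's inode table."""
    # Inode numbers start at 1, hence the trailing + 1.
    return group * INODES_PER_GROUP + table_block * INODES_PER_BLOCK + 1


def inumbers_in_block(group, table_block):
    """All inode numbers held in one inode table block."""
    start = first_inumber(group, table_block)
    return list(range(start, start + INODES_PER_BLOCK))


# (group, inode table block) pairs for the five bad blocks in the table.
bad = [(4448, 0), (4448, 7), (4448, 12), (4448, 489), (4449, 122)]
for group, table_block in bad:
    print(group, table_block, first_inumber(group, table_block))
```

Summing the lengths of inumbers_in_block over the five bad blocks gives the 80 lost entries.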
Now let's turn to the journal. I wrote a small tool that dumps information in each block of the journal. Since the journal superblock was missing, there were two pieces of information that I needed for this that were lost:
- whether the journal held 64-bit block numbers
- whether the journal used version 3 checksums
Fortunately, if I forced one (or both) of these switches on, the tags in some of the journal's descriptor blocks ran past the end of their block, which showed that neither flag was set.
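For anyone wanting to do something similar, here is a minimal sketch of how a journal block can be classified. It is not the tool I wrote; it only assumes the standard jbd2 layout, where every metadata block starts with a 12-byte big-endian header (magic, block type, sequence number). The classify name and the label strings are my own choices for illustration.

```python
import struct

JBD2_MAGIC = 0xC03B3998  # jbd2 block header magic, stored big-endian

# jbd2 block type codes and the labels used here for them.
BLOCK_TYPES = {
    1: "descriptors",
    2: "commit record",
    3: "superblock (v1)",
    4: "superblock (v2)",
    5: "revocation record",
}


def classify(block):
    """Return (kind, sequence) for one journal block given as bytes."""
    magic, blocktype, sequence = struct.unpack(">III", block[:12])
    if magic != JBD2_MAGIC:
        # No header: either a journaled data block or garbage.
        return "invalid/data block", None
    return BLOCK_TYPES.get(blocktype, "unknown type %d" % blocktype), sequence


# Example with a synthetic 4096-byte commit block (sequence 77).
fake_commit = struct.pack(">III", JBD2_MAGIC, 2, 77) + bytes(4084)
print(classify(fake_commit))
```

Parsing the descriptor tags themselves is where the 64-bit and checksum flags matter, since they change the tag size; that part is left out here.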
One awk script (fulllog.awk) later, I have a log of the form
0x0002A000 - descriptors
0x0002B000 -> block 159383670
0x0002C000 -> block 159383671
0x0002D000 -> block 0
0x0002E000 -> block 155189280
0x0002F000 -> block 195559440
0x00030000 -> block 47
0x00031000 -> block 195559643
0x00032000 -> block 195568036
0x00033000 -> block 159383672
0x0002B000 - invalid/data block
0x0002C000 - invalid/data block
0x0002D000 - invalid/data block
0x0002E000 - invalid/data block
0x0002F000 - invalid/data block
0x00030000 - invalid/data block
0x00031000 - invalid/data block
0x00032000 - invalid/data block
0x00033000 - invalid/data block
0x00034000 - commit record
commit time: 2014-12-25 16:53:13.703902604 -0500 EST
With this, another awk script (dumpallfor.awk) dumps all the blocks:
byte address  block      number of journaled blocks
0x8B00020000  145752096  6
0x8B00027000  145752103  10
0x8B0002C000  145752108  206
0x8B00209000  145752585  1
0x8B0029A000  145752730  0
So that last block is truly lost :( With any luck I can find out what files were there with debugfs's ncheck command.
So I have a bunch of blocks. And they all appear to differ! Now what?
I could go by the revocation records, but I can't seem to parse that structure meaningfully. I could go by the commit record timestamps, but before I try that, I want to see just how each inode table block differs. So I wrote another quick program (diff.go) to find that out.
For the most part, files that do differ differ only in timestamps, so we can just choose the file with the latest timestamps. We'll do that later. For all other files, we get this:
36438023 - size differs
36438139 - OSD1 (file version high dword) differs
36438209 - OSD1 differs
Hm, that's not good... The file with differing size will be a problem, and I have no idea what to do about the two OSD1 files. I also tried using debugfs's ncheck to see what the files were, but we don't have a match.
I then found out which block dumps have the latest timestamps for now (same repo, latest.go). The important thing to note is that I had the blocks scanned in chronological order by commit time. This is not necessarily the same as numerical order by block number; the journal is not always stored in chronologically increasing order.
As it turns out, however, the newest block (by commit time) is indeed the one with the latest timestamps!
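To make the ordering point concrete, here is a tiny sketch with made-up records (the offsets and times below are hypothetical, not taken from my journal). It picks the newest copy of a block by commit time and shows that this can disagree with simply taking the copy at the highest journal offset.

```python
from datetime import datetime

# Hypothetical copies of one inode table block:
# (byte offset within the journal, commit time of the enclosing transaction).
copies = [
    (0x0002B000, datetime(2014, 12, 25, 16, 53, 13)),
    (0x00F3C000, datetime(2014, 12, 24, 9, 30, 0)),
]


def newest_copy(copies):
    """Pick the copy with the latest commit time, ignoring journal position."""
    return max(copies, key=lambda c: c[1])


def last_by_offset(copies):
    """Naive alternative: pick the copy stored furthest into the journal."""
    return max(copies, key=lambda c: c[0])


print(hex(newest_copy(copies)[0]))     # copy chosen by commit time
print(hex(last_by_offset(copies)[0]))  # copy chosen by position
```

With these records the two choices differ, which is why I sorted by commit time before comparing.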
Let's try these latest blocks and see if we can recover anything from them.
sudo dd if=BLOCKFILE of=DDRESCUEIMG bs=1 seek=BYTEOFFSET conv=notrunc
After that my home directory is back!
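The dd invocation above overwrites one block inside the image without truncating it. A rough Python equivalent, as a sketch only: patch_block is a made-up helper name, and the 4096-byte check assumes each BLOCKFILE holds exactly one filesystem block.

```python
BLOCK_SIZE = 4096  # filesystem block size from the text


def patch_block(image_path, block_path, byte_offset):
    """Write the contents of block_path into image_path at byte_offset."""
    with open(block_path, "rb") as f:
        data = f.read()
    if len(data) != BLOCK_SIZE:
        raise ValueError("expected %d bytes, got %d" % (BLOCK_SIZE, len(data)))
    # "r+b" opens for in-place update, like conv=notrunc does for dd.
    with open(image_path, "r+b") as img:
        img.seek(byte_offset)
        img.write(data)
```

For example, patch_block(DDRESCUEIMG, BLOCKFILE, 0x8B00020000) would restore the first bad inode table block. Work on a copy of the image if you can.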
Now let's find out what those three differing files were...
Inode Pathname
36438023 /pietro/.cache/gdm/session.log
36438209 /pietro/.config/liferea
36438139 /pietro/.local/share/zeitgeist/fts.index
The only important thing there is Liferea's configuration directory, but I don't think that was corrupted; it was one of the OSD1-differing ones.
And let's find out about those 16 inodes in the final block, the one that we could not recover:
Inode Pathname
36448176 /pietro/k2
36448175 /pietro/Downloads/sOMe4P7.jpg
36448174 /pietro/Downloads/picture.png
36448164 /pietro/Downloads/tumblr_nfjvg292T21s4pk45o1_1280.png
36448169 /pietro/Downloads/METROID Super Zeromission v.2.3+HARD_v2.4.zip
36448165 /pietro/Downloads/tumblr_mrfex1kuxa1sbx6kgo1_500.jpg
36448173 /pietro/Downloads/1*-vuzP4JAoPf9S6ZdHNR_Jg.jpeg
36448162 /pietro/.cache/upstart/gnome-settings-daemon.log.6.gz
36448163 /pietro/.cache/upstart/dbus.log.7.gz
36448171 /pietro/.cache/upstart/gnome-settings-daemon.log.3.gz
36448161 /pietro/.local/share/applications/Knytt Underground.desktop
36448166 /pietro/Documents/Screenshots/Screenshot from 2014-12-03 15:47:29.png
36448170 /pietro/Documents/Screenshots/Screenshot from 2014-12-03 16:51:26.png
36448172 /pietro/Documents/Screenshots/Screenshot from 2014-12-03 19:08:54.png
36448168 /pietro/Documents/transactions/premiere to operating transaction 4305747926.pdf
36448167 /pietro/Documents/transactions/transaction 4315883542.pdf
In short:
- a text file with only one or two things in it that I could get back by brute force, since I know that it has a date stamp and something that's also in my chat logs
- some images downloaded from the internet; if I can't get the URLs back from Firefox's history then I can use photorec
- a ROM hack that I can easily get on the Internet again =P
- log files; no loss here
- the .desktop file for a Steam game
- screenshots; I can get these back with photorec assuming gnome-screenshot added the datestamp as metadata
- bank account transaction records; if I can't get them from the bank I could probably recover them with photorec
So not casualtyless but not a total loss, and I learned more about ext4 in the process. Thanks anyway!
UPDATE
Might as well put this out there:
NOT YET /pietro/k2
FOUND /pietro/Downloads/sOMe4P7.jpg
NOT YET /pietro/Downloads/picture.png
FOUND /pietro/Downloads/tumblr_nfjvg292T21s4pk45o1_1280.png
GOOGLEIT /pietro/Downloads/METROID Super Zeromission v.2.3+HARD_v2.4.zip
FOUND /pietro/Downloads/tumblr_mrfex1kuxa1sbx6kgo1_500.jpg
FOUND /pietro/Downloads/1*-vuzP4JAoPf9S6ZdHNR_Jg.jpeg
UNNEEDED /pietro/.cache/upstart/gnome-settings-daemon.log.6.gz
UNNEEDED /pietro/.cache/upstart/dbus.log.7.gz
UNNEEDED /pietro/.cache/upstart/gnome-settings-daemon.log.3.gz
UNNEEDED /pietro/.local/share/applications/Knytt Underground.desktop
NOT YET /pietro/Documents/Screenshots/Screenshot from 2014-12-03 15:47:29.png
NOT YET /pietro/Documents/Screenshots/Screenshot from 2014-12-03 16:51:26.png
NOT YET /pietro/Documents/Screenshots/Screenshot from 2014-12-03 19:08:54.png
NOT YET /pietro/Documents/transactions/premiere to operating transaction 4305747926.pdf
NOT YET /pietro/Documents/transactions/transaction 4315883542.pdf
And in case I'm not weird enough, the downloaded pictures were:
These were all shared by friends in chats.
I guess I'll keep this updated? (Not like it would make a difference...) I know I can recover everything; the only question is when =P
I've exchanged emails with the author of ddrescue, Antonio Diaz, and he told me that the correct parameter to use with an "advanced format" drive (i.e., a drive with 4096-byte physical sectors, but 512-byte "logical sectors") is:
-b4096
If you wanted it to read just one 4096-byte sector at a time (slow!) then you would also specify:
-c1
Antonio is not active on StackExchange, but he supports ddrescue via this email mailing list:
https://www.mail-archive.com/bug-ddrescue@gnu.org/
If you send your email to bug-ddrescue@gnu.org then your email will appear on that summary page, as will his answer, in nicely organized form (but without your email address shown, of course). Additionally, you may search on that page to try to find previous discussions of your issue or question, before bothering Antonio. (He is a very busy man, so please don't waste his time!)
The reason that your ddrescue logfile contains a 512-byte "bad" area is that you initially ran ddrescue with the default sector size of 512 bytes. That's not disastrous, but if ddrescue thinks the drive has 512-byte sectors, and a read is issued that returns 0 bytes of data due to a read error, then ddrescue assumes that only the first 512 bytes are unreadable, and makes no assumption about the rest. So only 512 bytes are marked as bad in the logfile.
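If you want to see which physical sector a 512-byte bad area belongs to, the arithmetic is simple. The sketch below is my own illustration, not something ddrescue does for you, and it assumes that one unreadable 512-byte logical sector makes the whole 4096-byte physical sector around it suspect. The widen name is made up.

```python
PHYSICAL_SECTOR = 4096  # physical sector size of an "advanced format" drive


def widen(pos, size, sector=PHYSICAL_SECTOR):
    """Expand a bad area (pos, size) in bytes to whole physical sectors."""
    start = pos - pos % sector              # round the start down
    end = -(-(pos + size) // sector) * sector  # round the end up
    return start, end - start


# A 512-byte bad area that begins 512 bytes into a physical sector.
start, size = widen(0x8B00020200, 512)
print(hex(start), size)
```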
Best Answer
You'll need the block numbers of all encountered bad blocks (ddrescue should have given you a list, I hope you saved it), and then you'll need to find out which files make use of these blocks (see e.g. here). You may want to script this if there are a lot of bad blocks.
e2fsck doesn't help: it just checks the consistency of the file system itself, so it will only act if the bad blocks contain "administrative" file system information.
The bad blocks in the files will just be empty.
Edit
Ok, let's figure out the block size thingy. Let's make a trial filesystem with 512-byte device blocks:
So the filesystem block size is 1024, and we've 100 of those filesystem blocks (and 200 512-byte device blocks). Rescue it:
So the hex ddrescue units are in bytes, not any blocks. Finally, let's see what debugfs uses. First, make a file and find its contents:
So the byte address of the data is 0x5400. Convert this to 1024-byte filesystem blocks:
and let's also try the block range while we are at it:
So that works out as expected, except block 0 is invalid, probably because the file system metadata is there. So, for your byte address 0x30F8A71000 from ddrescue, assuming you worked on the whole disk and not a partition, we subtract the byte address of the partition start
Divide that by the tune2fs block size to get the filesystem block (note that since multiple physical, possibly damaged, blocks make up a filesystem block, numbers needn't be exact multiples):
and that's the block you should test with debugfs.
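The conversion described above boils down to one subtraction and one integer division. A minimal sketch follows; the fs_block name is made up, and the partition start and block size used for your address are placeholders that you need to replace with the values from your partition table and from tune2fs.

```python
def fs_block(byte_address, partition_start=0, block_size=4096):
    """Filesystem block number containing a byte address on the disk."""
    # Integer division: the address need not be an exact multiple.
    return (byte_address - partition_start) // block_size


# The trial filesystem: data at byte 0x5400, 1024-byte filesystem blocks.
print(fs_block(0x5400, block_size=1024))

# Your address. Both values below are hypothetical placeholders.
HYPOTHETICAL_PARTITION_START = 2048 * 512  # e.g. a partition starting at sector 2048
HYPOTHETICAL_BLOCK_SIZE = 4096             # use the block size tune2fs reports
print(fs_block(0x30F8A71000, HYPOTHETICAL_PARTITION_START, HYPOTHETICAL_BLOCK_SIZE))
```

The first call gives block 21 for the trial filesystem. If ddrescue was run on the partition itself rather than the whole disk, leave partition_start at 0.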