How are files laid out in ext2/ext3/ext4

data-recovery ext2 ext3 ext4

A few days ago, all the metadata on my ext4-formatted flash card was overwritten.

I am now going to speculate on how this happened; this is pure speculation. It happened just after I used a different card, and the volume label on the card is now the same as the other card's. So I suspect that I failed to sync/unmount the other card before I pulled it. The card reader does not properly notify the system when a card is pulled, so at the next system-generated sync the system did not know that I had changed cards, and it overwrote the metadata.

The first thing I did when this happened was create an image using dd. The second thing I did was make the image read-only. The third thing I did was make a writable copy of the image.

I discovered photorec, which managed to recover some things, but not all. I think one of the reasons for this is that it is nondestructive.

Since some of the recovered files are text, I suspect that photorec uses minimal information about the file format, if any.

To try to recover any of the other files from the card, I would need to know how files are laid out in ext2. I suspect the basic idea is that files are broken into blocks, which are written into sectors, and that information on how to find the next sector is somehow written in the present sector.

What I need to proceed is information on how the pointer to the next sector is written.

PS: I am reading the photorec code, but am having some problems reading it. Whether it's me or whether the code is ugly, I don't know.

PPS: I have found some information on how ext file systems are laid out, but can't seem to find basic file layout info.

Best Answer

PhotoRec scans a disk (or disk image) searching for contiguous chunks of bytes-that-look-like-known-file-formats (for example, it can recognize JFIF/EXIF (JPEG) by the segment headers). Pretty simple but limited.
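To make the idea concrete, here is a minimal sketch of signature-based scanning in Python. It is not PhotoRec's actual code; the function name `find_signatures`, the 512-byte alignment, and the three-byte JPEG prefix are assumptions chosen for illustration. It only reports where a candidate file might start, and says nothing about where it ends or whether its blocks are contiguous.

```python
# Illustrative only: a toy version of header-signature scanning.
# JPEG files begin with the SOI marker FF D8, followed by another marker (FF ..).
JPEG_SOI = b"\xff\xd8\xff"


def find_signatures(data: bytes, signature: bytes = JPEG_SOI, block_size: int = 512):
    """Return offsets of block-aligned positions in data that start with signature.

    Assumption: files start on a block boundary, so only offsets that are
    multiples of block_size are checked.
    """
    offsets = []
    for offset in range(0, len(data), block_size):
        if data[offset:offset + len(signature)] == signature:
            offsets.append(offset)
    return offsets
```

To try it on a disk image, read the image (or a slice of it) into memory with `open(path, "rb").read()` and pass the bytes in; for large images you would read and scan in chunks instead.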

The Sleuth Kit is a great tool for digging into filesystems. With a bit of care (and scripting its tools and hex-editing the disk image when it goes astray), it can be used for recovery.

For a tool that more deeply understands ext, try ext4magic. (I haven't had a need to use it yet.)

Documentation/filesystems/ext2.txt in the kernel sources has a high-level overview of the general structure. The Ext4 Wiki has good information, including Ext4 Disk Layout containing more details (largely applying to ext[23] as well).

But yes, a file's data is split up into blocks. In ext2, each file is represented by an inode, which holds pointers to the first few data blocks directly (direct blocks), a pointer to an indirect block (a block that itself contains pointers to data blocks), a pointer to a double indirect block, and a pointer to a triple indirect block. There are no backlinks: a data block does not record which file it belongs to or which block comes next, so to find a data block's siblings, you'll have to scan all inodes and block pointers to find its owner first.
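As a rough sketch of how that pointer tree is indexed, the Python function below maps a file's logical block number to the level of the ext2/ext3 block map that holds its pointer. It assumes the classic layout of 12 direct pointers followed by one single, one double, and one triple indirect pointer, with 4-byte block pointers; the name `locate_block` is made up for this example. Note that ext4 files normally use extents rather than this block map, so this applies to ext2/ext3-style files only.

```python
def locate_block(n: int, block_size: int = 4096):
    """Return (level, path) for logical block n of a file.

    level is "direct", "indirect", "double" or "triple"; path lists the
    index to follow at each step (slot in the inode's direct array, or
    slot inside each successive indirect block).
    Assumes 12 direct pointers and 32-bit (4-byte) block pointers.
    """
    if n < 0:
        raise ValueError("logical block number must be non-negative")
    direct = 12
    ptrs = block_size // 4  # pointers that fit in one indirect block

    if n < direct:
        return ("direct", [n])
    n -= direct
    if n < ptrs:
        return ("indirect", [n])
    n -= ptrs
    if n < ptrs ** 2:
        return ("double", [n // ptrs, n % ptrs])
    n -= ptrs ** 2
    if n < ptrs ** 3:
        return ("triple", [n // ptrs ** 2, (n // ptrs) % ptrs, n % ptrs])
    raise ValueError("block number exceeds what the block map can address")
```

For recovery, the practical consequence is that an indirect block looks like an array of little-endian 32-bit block numbers, which is one pattern to search for in an image when the inodes themselves are gone.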
