Read past end of file to recover data

data-recovery

A very old .swp file reverted a file I was editing, so it is now significantly shorter. I haven't done anything in that directory since, so the bytes immediately following the end of the file should still have my data. What function can I use to read N bytes from a given memory address? dd and read stop at file boundaries, unless I missed an option somewhere.

The current file size is 3.2 KB. I don't remember exactly how big the file was before it was truncated, but probably not more than 10 KB. How can I read 10 KB from the beginning of the file, ignoring file boundaries? It is fine if the data is not perfectly preserved, as long as I don't have to start from scratch.

Best Answer

When editors save a file, they usually delete it or truncate it to length 0, which frees its allocated space, and then write the new contents, which allocates fresh space. The filesystem can then put the data in a completely different physical location, so your idea might not work.

You can get the physical location of a file using filefrag or hdparm --fibmap, and then use dd to read that physical location directly. I've described this process in a different context here: https://unix.stackexchange.com/a/85880/30851
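
A minimal sketch of that dd step, assuming filefrag -v has already given you the file's first physical block and the block size (its header line reports "blocks of N bytes"). The function name read_physical and the block number in the usage line are hypothetical:

```shell
# read_physical DEVICE FIRST_BLOCK BLOCK_SIZE NBLOCKS
# Copy NBLOCKS blocks from DEVICE to stdout, starting at physical block
# FIRST_BLOCK. Nothing here knows about file boundaries.
read_physical() {
    dev=$1; first_block=$2; block_size=$3; nblocks=$4
    # With bs set, skip= and count= are counted in blocks of that size.
    dd if="$dev" bs="$block_size" skip="$first_block" count="$nblocks" 2>/dev/null
}
```

For a 3.2 KB file on a filesystem with 4096-byte blocks, three blocks cover the 10 KB you want: read_physical /dev/partition 1234567 4096 3 > /some/other/disk/recovered.bin (1234567 is a placeholder; send the output to a different filesystem so it cannot overwrite what you are recovering).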


In your case it's more likely you need the general approach for finding textual data... something like:

strings -n 12 -t d /dev/partition | grep -F 'text snippet'

strings looks for runs of consecutive printable ASCII characters and, with -t d, prints the decimal offset at which each run was found. (It supports some other encodings too; I'm not sure about UTF-8, but for code or English text you won't need it.)

text snippet should be an exact, unique piece of text, all on a single line, that you remember being in the part of the file you're looking for. (If you don't remember it exactly, you could grep with a regular expression instead.)

-n 12 is the minimum length of string that strings will report. Set it to the length of your text snippet (12 here). This parameter is optional; if provided, it might help strings | grep go a little faster.
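
The pieces above can be wrapped up as follows. This is only a sketch: the function name find_offsets is made up, -a is added so that strings scans the whole input, and awk is used to keep just the offset column.

```shell
# find_offsets SOURCE SNIPPET
# Print the decimal offset of every printable run in SOURCE that
# contains SNIPPET (one offset per line).
find_offsets() {
    src=$1; snippet=$2
    # -a: scan the whole input; -n: minimum run length, here the snippet
    # length; -t d: prefix each run with its decimal byte offset.
    strings -a -n "${#snippet}" -t d "$src" |
        grep -F -- "$snippet" |
        awk '{ print $1 }'
}
```

Usage would be something like find_offsets /dev/partition 'text snippet'. The offset printed is where the matching line starts, not where the snippet itself starts within that line.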

Reading the entire partition will take a long time, but if it succeeds, you'll have an offset you can feed to dd to grab the general area; then remove whatever does not belong.
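
Grabbing the general area around an offset could look like the sketch below. The function name carve_region and the sizes in the usage line are assumptions; bs=1 is used so that skip= and count= are plain byte counts, which is slow for big reads but fine for a few kilobytes.

```shell
# carve_region SOURCE OFFSET BEFORE LENGTH
# Copy LENGTH bytes from SOURCE to stdout, starting BEFORE bytes ahead
# of OFFSET (clamped to the start of SOURCE). OFFSET is the decimal
# offset that strings -t d printed.
carve_region() {
    src=$1; offset=$2; before=$3; length=$4
    start=$((offset - before))
    if [ "$start" -lt 0 ]; then
        start=0
    fi
    # bs=1 turns skip= and count= into byte counts.
    dd if="$src" bs=1 skip="$start" count="$length" 2>/dev/null
}
```

For example, carve_region /dev/partition 123456789 4096 16384 > /some/other/disk/area.txt (the offset is a placeholder) takes 4 KB before the hit and 12 KB after it, which you can then trim in an editor.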

"I haven't done anything in that directory since"

Unless your directory happens to be a mountpoint, that doesn't help much: most filesystems don't reserve space "per directory", so any write anywhere in the filesystem might overwrite the data you're looking for. In a data recovery situation, you usually switch the entire filesystem to read-only mode.
