I accidentally lost a PDF file during the following process:
- I was running a PDF application, PDFXCView, under Wine on Ubuntu 18.04, and used it to open a PDF file on an ext4 filesystem.
- Then I used mv to move the PDF file somewhere else.
- Then I edited the PDF file, which was still open in PDFXCView. When I tried to save the edited file, I had to choose "Save as…" to locate the file's current path, and I attempted to overwrite it. PDFXCView failed to overwrite the file, made it disappear, and then aborted.
Here is what I know and what I have tried so far:
- If it helps, I remember the pathname of the lost PDF file.
- I couldn't back up the partition with dd, since I don't have an additional hard drive with enough capacity for the partition.
- I tried debugfs, following https://unix.stackexchange.com/a/80285:

    $ sudo debugfs -w /dev/sda4
    debugfs:  lsdel
     Inode  Owner  Mode    Size      Blocks   Time deleted
    22549259   1000 100600    141      1/     1 Sat Apr  2 09:14:06 2016
    1 deleted inodes found.
    debugfs:  logdump -i 22549259
    22549259: File not found by ext2_lookup

  The file was lost just now, not deleted in 2016, so I am not sure debugfs found the correct inode.
- I saw https://unix.stackexchange.com/a/98700/, which suggests using

    grep -a -C 500 'known pattern' /dev/sda | tee /tmp/recover

  to recover a text file that contains a known pattern. A while ago, I created the lost PDF file by concatenating several smaller PDF files using pdftk, and I still have those smaller files. Viewing the binary content of one of them with cat smaller.pdf | less, I can see a readable string, a URI embedded in the PDF (http://flask.pocoo.org/docs/1.0/api/#flask.Flask.logger). So I tried:

    sudo grep -a -C 500 'http://flask.pocoo.org/docs/1.0' /dev/sda4 > /tmp/test/recover

  However, the small files and the lost file both contain that string, and -C 500 is too arbitrary to mark the beginning and end of a file, so I am not sure it can produce useful results.
What ways can I try to recover the PDF file?
Thanks!
Best Answer
Definitely start by leaving the partition that holds the data alone, if at all possible (you would be surprised what you can recover even a month later if it is not your main system partition). Then proceed with foremost. (I originally mentioned magicrescue, but foremost performs just as well, and it has a ready recipe for pdf.) I just ran it for a few seconds on one of my /dev/sdX drives and pulled 370 pdf files. The files will not have their original names and will look like this: 14348984.pdf, so the -i flag is pretty important.
Good luck.
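A minimal sketch of how the foremost step might look, under several assumptions that are not in the original: the partition is /dev/sda4 (as in the question), /mnt/usb/foremost is a hypothetical output directory on a different filesystem, and foremost is assumed to place carved PDFs in a pdf/ subdirectory of that output directory. Because the carved files carry no original names, the helper find_candidates (a made-up name) narrows them down by a string known to occur in the lost file.

```shell
#!/bin/sh
# Hypothetical invocation, left commented out because it needs root, the
# foremost package, and an output directory on a DIFFERENT filesystem:
#   sudo foremost -t pdf -i /dev/sda4 -o /mnt/usb/foremost

# List the carved PDFs in directory $1 that contain the fixed string $2.
#   -a  treat binary files as text
#   -l  print only the names of matching files
#   -F  take the pattern literally instead of as a regex
find_candidates() {
    grep -a -l -F -- "$2" "$1"/*.pdf 2>/dev/null
}

# Example (paths are assumptions):
#   find_candidates /mnt/usb/foremost/pdf 'http://flask.pocoo.org/docs/1.0'
```

This filter only helps when the string is stored uncompressed in the PDF, which is what the asker observed in the smaller files. Those smaller files may be carved as well and will match too, so the largest match is the more likely candidate for the concatenated file.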
Update
Your second option is testdisk/photorec, which in your case may be easier given that the path is known. testdisk and photorec do have some caveats: if you are not careful (and happen to confirm the multiple dialogs asking whether you want to apply changes), they can lead to disk damage. But if you take it slow, this option may be more appropriate, and faster, as it will likely show you a good folder tree structure with a node corresponding to your missing file. If you do not find your file with foremost in, let's say, 2 hours, post a comment and I will provide a secondary testdisk approach.
Update 2
When I just tested it, testdisk crushed foremost at locating a specific deleted file. It preserved the folder tree and filename structure perfectly, thus limiting the time otherwise spent creating every *.pdf file. The two approaches differ substantially, and if the file is very important, I would definitely use both testdisk and foremost to locate the same file, to be sure I end up with a full, non-corrupted file.