Background
I accidentally deleted an important Python script, so I ran the command
sudo grep --binary-files=text --context=100 'unique string' /dev/sda1 > recover_file
to search for it on my hard drive and save matches to ./recover_file. When I open ./recover_file in Vi ("Vi Unimproved", not Vim) I see that it is ~10800 lines long and contains many versions of my ~200-line file, with some junk between each occurrence, as expected. But there are also hundreds of strange lines with unexpected behavior that I will attempt to describe.
I have line numbers on. If line 19 is the first strange line in the file, upon opening the file I get a message at the bottom of the window saying
Conversion error on line 19
Initially the strange lines appear as empty lines, like the lines displayed at the bottom of a document when there are no more lines in the file to be displayed, with a ~ character at the far left edge of the window, but located between two other lines, not at the end of the file:
18 junk junk junk
~
20 junk junk junk
When I attempt to delete line 19 using dd, nothing happens. If I delete a normal line, then line 19's appearance changes and it looks like any other blank line:
18 junk junk junk
19
20 junk junk junk
But as soon as I move my cursor over it, the line number disappears and it looks just like it did before. If I try to perform any operation on it, such as inserting or appending text, I get
Error: unable to retrieve line 19
If I write the file to the disk, I get
Error: recover_file: Invalid or incomplete multibyte or wide character.
recover_file: WARNING: FILE TRUNCATED.
Then, if I close and re-open the file, I see that all lines from 19 onward have been removed, leaving only lines 1-18. I was able to reproduce the situation and copy a recent version of the Python file into a new file, after which further digging in ./recover_file produced a segmentation fault and the entire file was lost.
Questions
1) For future reference, is there a way I can remove these strange lines so that I can save the file directly without losing important data, or will I always need to highlight and copy from the terminal window?
2) I assume that this behavior is due to the presence of binary code in ./recover_file not corresponding to text characters, which Vi cannot render. If someone could confirm/correct this impression and perhaps provide further explanation, I would be grateful.
Update
I'm not sure if this is relevant, but I'm running Lubuntu 18.04 as a virtual machine on VMware Workstation 14 Player.
Best Answer
From looking at your script, you are dumping binary data and then trying to search and line-edit it with the vi text editor. That way, you will encounter a lot of control characters that subvert the notion of lines, line lengths and, in some situations, possibly even the end of file.
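To get a feel for which bytes are involved before opening such a file in an editor, here is a minimal sketch. The temporary file is a made-up stand-in for recover_file, and the byte range treated as "text" (tab, newline, carriage return and printable ASCII) is an assumption for illustration.

```shell
# Made-up sample: two clean lines around one line that starts with
# the raw bytes 0xff 0xfe, which are not valid text in a UTF-8 locale.
sample=$(mktemp)
printf 'junk line 18\n\377\376 broken\njunk line 20\n' > "$sample"

# Delete everything that counts as plain text here and count what is left.
nontext=$(LC_ALL=C tr -d '\11\12\15\40-\176' < "$sample" | wc -c | tr -d ' ')
echo "non-text bytes: $nontext"

# Show the file with non-printing bytes in caret/meta notation,
# instead of sending them raw to the terminal.
cat -v "$sample"
```

A non-zero count is a hint that an editor expecting valid multibyte text may trip over the file.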
Since you are only interested in text, and you are already somewhat parsing the disk contents, I would add a strings command to it to discard non-text characters. To be able to handle your output in vi, you may thus change your script to:
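A minimal sketch of such a pipeline, assuming the same grep options and search string as in the question. The temporary files are made-up stand-ins for /dev/sda1 and recover_file so that the example can be run safely; on the real system the input would be the device and grep would need sudo.

```shell
# Made-up stand-in for the raw device: text mixed with NUL, high and control bytes.
disk=$(mktemp)
printf 'junk\0\377\nprint("unique string")\n\001\002more junk\n' > "$disk"

# Hypothetical output file standing in for recover_file.
out=$(mktemp)

# Same grep as in the question, with strings appended so that only
# runs of printable characters reach the output file.
grep --binary-files=text --context=100 'unique string' "$disk" | strings > "$out"

cat "$out"
```

With no file argument, strings reads from standard input, which is what makes it usable at the end of the pipe.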
I also suspect it will be more efficient to discard those control chars to begin with as in:
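A sketch of that ordering, again on a made-up temporary file standing in for /dev/sda1 (the real device would need sudo on the strings call). The file names and sample contents are assumptions for illustration only.

```shell
# Made-up stand-in for the raw device.
disk=$(mktemp)
printf 'junk\0\377\nprint("unique string")\n\001\002more junk\n' > "$disk"

# Hypothetical output file standing in for recover_file.
out=$(mktemp)

# strings runs first, so grep only ever sees printable text and no longer
# needs --binary-files=text. The 100 lines of context are now counted in
# lines of strings output, not in newline-delimited chunks of the raw data.
strings "$disk" | grep --context=100 'unique string' > "$out"

cat "$out"
```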
Though I am not entirely sure this last instruction will give the same results, since the data is then handled as text and not as binary.
From
man strings