I managed to shoot myself where it hurts (really bad) by reformatting a partition that held valuable data. Of course it was not intentional, but it happened.
However, I managed to use testdisk and photorec to recover most of the data. So now I have all that data distributed over almost 25,000 directories. Most of the files are .txt files, while the rest are image files. There are more than 300 .txt files in each directory.
I can grep or use find to extract certain strings from the .txt files and output them to a file. For example, here's a line that I've used to verify that my data is in the recovered files:
find ./recup*/ -name '*.txt' -print | xargs grep -i "searchPattern"
I can output "searchPattern" to a file, but that just gives me that pattern. Here's what I really would like to accomplish:
Go through all the files and look for a specific string. If that string is found in a file, cat ALL the contents of that file to an output file. If the pattern is found in more than one file, append the contents of subsequent files to that output file. Note that I don't want to output just the pattern I'm searching for, but ALL the contents of the file in which the pattern is found.
I think this is doable, but I just don't know how to grab all the contents of a file after grepping a specific pattern from it.
Best Answer
If I understand your goal correctly, the following will do what you want:
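A minimal sketch of such a command, wrapped in a small function so it can be reused. It assumes a find and grep that support the options shown (-exec, -q, -i); the function name collect_matches is hypothetical, and searchPattern and outputfile.txt are the placeholders used in this thread:

```shell
# collect_matches PATTERN OUTFILE
# Write the full contents of every ./recup*/ .txt file that contains
# PATTERN (case-insensitive) into OUTFILE, one file after another.
collect_matches() {
    # The first -exec acts as a test: grep -q exits 0 only on a match,
    # so the second -exec (cat) runs only for matching files.
    find ./recup*/ -type f -name '*.txt' \
        -exec grep -qi -- "$1" {} \; \
        -exec cat {} \; > "$2"
}
```

Usage would be something like: collect_matches "searchPattern" outputfile.txt. The redirection applies to the whole find run, so the contents of every matching file end up in the same output file. Keep the output file outside the ./recup*/ directories so it is not searched itself.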
This will look for all *.txt files in ./recup*/, test each one for searchPattern, and if it matches it'll cat the file. The output of all the cat'ed files will be directed into outputfile.txt.
Repeat for each pattern and output file.
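If there are several patterns, the repetition could be scripted. This is only a sketch: the function name collect_each_pattern and the matches_PATTERN.txt naming scheme are made up for illustration, and it assumes each pattern is a simple word that is safe to use in a file name:

```shell
# collect_each_pattern PATTERN...
# For each pattern, gather the full contents of all matching
# ./recup*/ .txt files into matches_<pattern>.txt (hypothetical naming).
collect_each_pattern() {
    for pattern in "$@"; do
        find ./recup*/ -type f -name '*.txt' \
            -exec grep -qi -- "$pattern" {} \; \
            -exec cat {} \; > "matches_${pattern}.txt"
    done
}
```

For example, collect_each_pattern invoice meeting would write matches_invoice.txt and matches_meeting.txt in the current directory.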
If you have a very large number of directories matching ./recup*, you might end up with an "argument list too long" error. The simple way around this is to let find match the paths itself rather than having the shell expand ./recup*. That approach matches against the full path, so ./recup01234/foo/bar.txt will be matched. The -mindepth 2 option to find is there so that it won't match ./recup.txt or ./recup0.txt.
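The workaround described above could be sketched as follows. It assumes a find that supports -mindepth (GNU and BSD find do, but it is not in POSIX) as well as -path; the function name collect_matches_by_path is hypothetical:

```shell
# collect_matches_by_path PATTERN OUTFILE
# Same idea as before, but the shell never expands ./recup*, so the
# argument list stays short no matter how many directories exist.
collect_matches_by_path() {
    # -path matches the whole path, and its * also matches slashes,
    # so ./recup01234/foo/bar.txt is included.
    # -mindepth 2 skips anything directly in the current directory,
    # such as ./recup.txt or the output file itself.
    find . -mindepth 2 -type f -path './recup*.txt' \
        -exec grep -qi -- "$1" {} \; \
        -exec cat {} \; > "$2"
}
```

Usage would be something like: collect_matches_by_path "searchPattern" outputfile.txt, run from the directory that contains the recup directories.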