What would be the best way to create a list of files that have words in common with a given file? For example, if I had:
$ ls
mainFile file1 file2 file file4
$ cat mainFile
exquisite malicious sentient pulsating
perspicacious one
tawdry fumigate Baryshnikov O'connor
and I wanted to list any of the files in the cwd that contained any one of the words in mainFile. What would be the best way to go about this?
Since the number of words per line in mainFile is not constant, I was finding solutions using cut a little tricky. I was trying to create a string out of the words and then place them, separated by |, in a grep -l "exquisite|malicious|etc" * command. I'm open to any method, though, that might be better.
Best Answer
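First, build a word list from mainFile, one unique word per line. Using tr instead of sed avoids relying on \n in a sed replacement, which only GNU sed supports, and grep . drops empty lines, since an empty pattern would match every line:
tr -s '[:space:]' '\n' < mainFile | grep . | sort -u > mainFile.idx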
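Then grep for those words as fixed strings, with -l so that only the names of matching files are printed (add -w if only whole-word matches should count):
grep -l -F -f mainFile.idx file*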
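
The two steps can also be wrapped in one small function. The following is only a sketch under a few assumptions: list_common is a hypothetical name that does not come from the answer, the word list goes into a temporary file made with mktemp rather than mainFile.idx, and -w (supported by GNU and BSD grep, but not required by older POSIX versions) is used so that a word like "one" does not match "someone".

```shell
#!/bin/sh
# list_common FILE CANDIDATE...
# Hypothetical helper (not from the answer above): prints the names of the
# candidate files that contain at least one whole word from FILE.
list_common() {
    main=$1
    shift
    # Temporary file that holds one unique word per line.
    words=$(mktemp) || return 1
    # Split on any whitespace, drop empty lines (an empty pattern would
    # match every line), then sort and deduplicate.
    tr -s '[:space:]' '\n' < "$main" | grep . | sort -u > "$words"
    # -l: print file names only; -w: whole words; -F: fixed strings.
    grep -l -w -F -f "$words" -- "$@"
    status=$?
    rm -f "$words"
    return $status
}
```

With the files from the question, it would be called as list_common mainFile file*. The glob file* does not pick up mainFile itself, because that name starts with "main".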