I've got a list with 250 lines in it. I have to run all of them through a web server to get a list of output. That output, however, contains many more lines than I'm interested in. Say my list.txt is:
a.1
b.1
etc
then the output, output.txt, is:
a.1 a b c
a.2 b a b
a.3 d k o
b.1 b o p
b.2 o i y
b.3 p i y
etc
Is it possible to use the grep command to search output.txt for all the words in list.txt and then generate the "wanted" list, wanted.txt? I need the entire line from output.txt.
I'm new to scripting, but what I'd like is something such as:
grep list.txt output.txt > wanted.txt
I haven't been able to find any examples of this.
Best Answer
I'd ignore grep for this one. It's good for regular expressions, but it doesn't look like you really need that here.
comm can compare two files and show you intersections. Using your exact examples:
This is faster than any grep will be, but it relies (heavily) on the files being sorted. If they aren't, you can pre-sort them, but that will alter the output so it's sorted too.
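A minimal sketch of the sorted-file approach, using the sample data from the question in a scratch directory. One thing worth checking on your own data: comm compares whole lines, so it only reports lines that are identical in both files. Because output.txt carries extra columns here, the sketch also shows join, which is not part of the answer above and is swapped in as an assumption; it matches on the first field and prints the rest of the line. The file names list.sorted, output.sorted and common.txt are made up for illustration.

```shell
# Sample data from the question, written to a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'a.1' 'b.1' > list.txt
printf '%s\n' 'a.1 a b c' 'a.2 b a b' 'a.3 d k o' \
              'b.1 b o p' 'b.2 o i y' 'b.3 p i y' > output.txt

# Both tools expect sorted input; the C locale keeps the ordering consistent.
LC_ALL=C sort list.txt   > list.sorted
LC_ALL=C sort output.txt > output.sorted

# comm -12 keeps only the lines that are identical in both files.
LC_ALL=C comm -12 list.sorted output.sorted > common.txt

# Assumption (not from the answer): join matches on the first field only
# and prints the whole matching line from output.sorted.
LC_ALL=C join list.sorted output.sorted > wanted.txt
cat wanted.txt
```

With this sample data, common.txt stays empty because no line of output.txt is exactly a.1 or b.1, while wanted.txt holds the full a.1 and b.1 lines.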
Alternatively, this answer from iiSeymour will let you do it with grep. The flags ask for an input file and force a fixed-string, full-word search. This won't rely on order but will be based on the output.txt order. Reverse the files if you want them in the order of the list.txt.
If your list.txt is really big, you might have to tackle this a little more iteratively and pass each line to grep separately. This will massively increase processing time. In the above you'd be reading output.txt once, but this way you'd read and process it for every list.txt line. It's horrible... But it might be your only choice. On the upside, it does then sort things by the list.txt order.
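Both grep variants described above might look roughly like this. It is a sketch under stated assumptions: the flags -w, -F and -f are inferred from the description (full-word, fixed-string, patterns from a file), list.txt is deliberately written in reverse order to make the ordering difference visible, and wanted_by_list.txt is a made-up file name.

```shell
# Sample data in a scratch directory; list.txt is reversed on purpose.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'b.1' 'a.1' > list.txt
printf '%s\n' 'a.1 a b c' 'a.2 b a b' 'a.3 d k o' \
              'b.1 b o p' 'b.2 o i y' 'b.3 p i y' > output.txt

# Single pass: -f reads the patterns from list.txt, -F treats them as
# fixed strings (the dot is literal), -w accepts whole-word matches only.
# Matching lines come out in output.txt order.
grep -wFf list.txt output.txt > wanted.txt

# One grep per list entry: output.txt is re-read for every pattern,
# but the results follow list.txt order.
while IFS= read -r pattern; do
    [ -n "$pattern" ] || continue   # skip blank lines in list.txt
    grep -wF -- "$pattern" output.txt
done < list.txt > wanted_by_list.txt
```

Here wanted.txt lists the a.1 line before the b.1 line, while wanted_by_list.txt lists b.1 first, mirroring list.txt.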