Grep / parse text

awk, grep, sed, text processing

I need to extract drug names from Medline abstracts. I was hoping to do this by combining the outputs of grep -wf and grep -owf with paste, but the two outputs do not line up: grep -wf prints each matching line once, while grep -owf prints one line per match, even when several matches occur on the same input line.

Pattern file:

DrugA
DrugB
DrugC
DrugD

File to parse:

In our study, DrugA and DrugB were found to be effective.  DrugA was more effective than DrugB.
In our study, DrugC was found to be effective
In our study, DrugX was found to be effective

Desired output:

DrugA    In our study, DrugA and DrugB were found to be effective.  DrugA was more effective than DrugB.
DrugB    In our study, DrugA and DrugB were found to be effective.  DrugA was more effective than DrugB.
DrugC    In our study, DrugC was found to be effective

Best Answer

It's not strictly grep alone, but this does the trick:

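# For each drug name, print every matching line prefixed with that name and a tab.
while IFS= read -r pattern; do
    # -w: match whole words only; -F: treat the name as a fixed string, not a regex
    grep -wF -e "$pattern" input | awk -v drug="$pattern" 'BEGIN { OFS = "\t" } { print drug, $0 }'
done < patterns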
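
If the pattern list is long, the loop above reads the input file once per drug name. A single-pass variant in awk might look like the sketch below. It rests on a few assumptions that are not in the question: the drug names contain no regex metacharacters, a "word" is delimited by anything other than letters, digits, or underscore (roughly what grep -w does), and the scratch directory and the variable names workdir and result are only there to make the example self-contained. The output is grouped by input line rather than by drug name, which happens to give the same order for the sample data.

```shell
#!/bin/sh
# Sample data copied from the question, written to a scratch directory.
# "patterns" and "input" are the file names used in the answer above.
workdir=$(mktemp -d)
printf '%s\n' DrugA DrugB DrugC DrugD > "$workdir/patterns"
printf '%s\n' \
  'In our study, DrugA and DrugB were found to be effective.  DrugA was more effective than DrugB.' \
  'In our study, DrugC was found to be effective' \
  'In our study, DrugX was found to be effective' > "$workdir/input"

# First file (NR == FNR): remember each non-empty drug name, in order.
# Second file: print "name<TAB>line" once for every name that occurs
# in the line as a whole word (assumes names are plain text, not regexes).
result=$(awk '
  NR == FNR { if ($0 != "") names[++n] = $0; next }
  {
    for (i = 1; i <= n; i++)
      if ($0 ~ ("(^|[^A-Za-z0-9_])" names[i] "([^A-Za-z0-9_]|$)"))
        print names[i] "\t" $0
  }
' "$workdir/patterns" "$workdir/input")

printf '%s\n' "$result"
rm -rf "$workdir"
```

Each drug name is printed at most once per input line, no matter how many times it occurs in that line, which is the behaviour the paste approach with grep -owf could not provide.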