Bash – Match lines of one file against headers in another to extract the whole paragraph

awk bash perl

I need help with a script that works on two files. File 1 lists amino acids in a specific order, one per line, and names may repeat. File 2 lists the characteristic features of each amino acid under a header carrying its name. I am trying to look up each amino acid from File 1 in File 2, take the features listed under the matching header, and copy them to an output file in the same order as in File 1.

For example
File1.txt

    Threonine
    Glutamine
    Alanine
    Asparatate
    Glutamine
    Alanine
    Threonine

File2.txt

    [ Alanine ] 
    89.1    13.7    -3.12   -10.09
    [ Asparatate ]  
    133.1   30  -2.43   -10.35
    [ Glutamine ]   
    146.1   42.7    -3.46   -10.23
    [ Threonine ]   
    119.1   28.5    -2.43   -9.99   

The output I am expecting is as below:
output.txt

    [ Threonine ]   
    119.1   28.5    -2.43   -9.99
    [ Glutamine ]   
    146.1   42.7    -3.46   -10.23
    [ Alanine ] 
    89.1    13.7    -3.12   -10.09
    [ Asparatate ]  
    133.1   30  -2.43   -10.35
    [ Glutamine ]   
    146.1   42.7    -3.46   -10.23
    [ Alanine ] 
    89.1    13.7    -3.12   -10.09 
    [ Threonine ]   
    119.1   28.5    -2.43   -9.99

I tried the awk script below. It works when the index entries are numbers rather than words, but it does not produce the output I need here.

    awk 'FNR==NR { a[ "\\[ " $1 " \\]" ]; next } /^\[/ { f=0 } { for (i in a) if ($0 ~ i) f=1 } f' file1.txt file2.txt > output.txt

I do not know how to modify the script so that it also works with words. Please tell me where I am going wrong and how to get the desired output.

I would greatly appreciate your help.

Thanks in advance.

Asha

Best Answer

All you need to do is loop through the acids in File1.txt and, for each one, find the matching line in File2.txt plus the line after it, which is easily done with grep:

    for acid in $(sed 's/^\s*//' File1.txt)
    do
        grep -FA1 "$acid" File2.txt
    done > Output.txt
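
A slightly stricter variant of this loop, offered only as a sketch: it reads File1.txt line by line and searches for the full header `[ Name ]` as a fixed string, so a name that happens to be a substring of another name cannot match the wrong header. It assumes that every header in File2.txt is written exactly as `[ Name ]` with single spaces inside the brackets, that each header is followed by exactly one feature line, and that grep supports `-A` (GNU and BSD grep do). The sample files from the question are recreated only so that the snippet runs on its own.

```shell
# Illustration only: run in an empty scratch directory,
# because this overwrites File1.txt and File2.txt.
printf '%s\n' Threonine Glutamine Alanine Asparatate Glutamine Alanine Threonine > File1.txt
cat > File2.txt <<'EOF'
[ Alanine ]
89.1    13.7    -3.12   -10.09
[ Asparatate ]
133.1   30  -2.43   -10.35
[ Glutamine ]
146.1   42.7    -3.46   -10.23
[ Threonine ]
119.1   28.5    -2.43   -9.99
EOF

# Read one name per line; with the default IFS, read trims
# leading and trailing whitespace from the name.
while read -r acid; do
    [ -n "$acid" ] || continue        # skip blank lines
    # -F: treat the pattern as a fixed string, not a regex
    # -A1: also print the one line that follows the match
    grep -F -A1 -- "[ $acid ]" File2.txt
done < File1.txt > Output.txt
```

Because File2.txt is scanned once per name, this is fine for short lists; for long lists a single-pass approach is likely to be faster.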

But if you like awk:

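    awk '
    # Second file (File1.txt): print the header and the stored feature line.
    FNR!=NR{
        print "    [",$1,"]"
        print acids[$1]
        next
    }
    # First file (File2.txt): a header line; remember the acid name.
    /\[/{
        acid=$2
        next
    }
    # First file (File2.txt): a feature line; store it under the current acid.
    {
        acids[acid]=$0
    }' File2.txt File1.txt > Output.txt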
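
If a header may be followed by more than one feature line, the same idea can be extended. The following is only a sketch under assumptions that the question does not state: each header has the form `[ Name ]` with a single-word name, every line up to the next header belongs to that name, and a name missing from File2.txt should be reported on standard error instead of producing an empty line. Writing to `"/dev/stderr"` is assumed to be supported by the awk in use (gawk and mawk handle it). The sample files are recreated only so that the snippet runs on its own.

```shell
# Illustration only: run in an empty scratch directory,
# because this overwrites File1.txt and File2.txt.
printf '%s\n' Threonine Glutamine Alanine Asparatate Glutamine Alanine Threonine > File1.txt
cat > File2.txt <<'EOF'
[ Alanine ]
89.1    13.7    -3.12   -10.09
[ Asparatate ]
133.1   30  -2.43   -10.35
[ Glutamine ]
146.1   42.7    -3.46   -10.23
[ Threonine ]
119.1   28.5    -2.43   -9.99
EOF

awk '
# First file (File2.txt): collect every line of a block under its name.
FNR == NR {
    if ($1 == "[" && $3 == "]") {       # header line such as "[ Alanine ]"
        name = $2
        block[name] = $0
    } else if (name != "") {            # any other line joins the current block
        block[name] = block[name] "\n" $0
    }
    next
}
# Second file (File1.txt): print the stored block for each listed name.
NF {
    if ($1 in block) {
        print block[$1]
    } else {
        print "no entry for " $1 > "/dev/stderr"
    }
}' File2.txt File1.txt > Output.txt
```

File2.txt is read once into memory and File1.txt only drives the output order, so repeated names cost nothing extra.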