How to prepend a string to a field on selected rows of a .txt file using sed or awk

awk, sed, text processing

I have a text file named xid.txt:

xid: SC48028 id: artf398444
xid: indv1000 id: indv24519
xid: SC32173 id: artf398402
xid: SC21033 id: artf398372
xid: 1001 id: tracker4868
xid: wiki1000 id: wiki10709
xid: proj1234 id: proj12556

I need to add the string 'PT_' before 'SC48028', 'SC32173', and so on. The 'SC…' prefix is not fixed: it can be any other combination, such as 'AC…' or 'DL…'.

Required output:

xid: PT_SC48028 id: artf398444
xid: indv1000 id: indv24519
xid: PT_SC32173 id: artf398402
xid: PT_SC21033 id: artf398372
xid: 1001 id: tracker4868
xid: wiki1000 id: wiki10709
xid: proj1234 id: proj12556

As the output above shows, 'PT_' should not be inserted before strings that start with 'i', 'p', 'w', or a digit. I have tried a few basic commands for this using insert/append in sed.

Best Answer

With awk:

awk '$2~/^[A-Z][A-Z]/{ $2="PT_"$2 }1' xid.txt

The output:

xid: PT_SC48028 id: artf398444
xid: indv1000 id: indv24519
xid: PT_SC32173 id: artf398402
xid: PT_SC21033 id: artf398372
xid: 1001 id: tracker4868
xid: wiki1000 id: wiki10709
xid: proj1234 id: proj12556

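  • $2~/^[A-Z][A-Z]/ - matches if the 2nd field starts with 2 uppercase letters; the action then prepends "PT_" to that field
  • 1 - an always-true pattern with no action, so awk prints every line, modified or not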
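For anyone unsure what the one-liner is doing, here is a minimal sketch of the same logic written out long-hand and run against a few of the sample lines. The scratch file and the variable names (tmp, result) are only for illustration, and LC_ALL=C is an assumption added so that [A-Z] is read as a plain ASCII range on any awk.

```shell
# Recreate three of the sample lines in a scratch file (illustrative only).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
xid: SC48028 id: artf398444
xid: indv1000 id: indv24519
xid: 1001 id: tracker4868
EOF

# Long-hand equivalent of: awk '$2~/^[A-Z][A-Z]/{ $2="PT_"$2 }1'
# LC_ALL=C is an assumption: it keeps [A-Z] an ASCII-only range.
result=$(LC_ALL=C awk '{
    if ($2 ~ /^[A-Z][A-Z]/)   # field 2 begins with two uppercase letters
        $2 = "PT_" $2         # prepend the prefix to that field only
    print                     # print every line, changed or not
}' "$tmp")

printf '%s\n' "$result"
# xid: PT_SC48028 id: artf398444
# xid: indv1000 id: indv24519
# xid: 1001 id: tracker4868

rm -f "$tmp"
```

Note that assigning to $2 makes awk rebuild the line with single spaces between fields, which is harmless for this input but would collapse any wider spacing on the modified lines.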

Or a sed approach (note that -i edits xid.txt in place rather than printing the result):

sed -i 's/^\(xid:[[:space:]]*\)\([A-Z]\{2\}[^[:space:]]*\)/\1PT_\2/' xid.txt
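Since -i overwrites the file, it may be worth previewing the substitution first and keeping a backup. The following is a rough sketch under that assumption: the scratch directory, the .bak suffix and the variable names (work, script, preview, edited, backup) are illustrative choices, not part of the original command.

```shell
# Work on a throwaway copy with two of the sample lines (illustrative only).
work=$(mktemp -d)
printf '%s\n' 'xid: SC48028 id: artf398444' \
              'xid: wiki1000 id: wiki10709' > "$work/xid.txt"

# Same substitution as in the answer, stored once so both runs use it.
script='s/^\(xid:[[:space:]]*\)\([A-Z]\{2\}[^[:space:]]*\)/\1PT_\2/'

# Dry run: without -i, sed prints the result and leaves the file untouched.
preview=$(sed "$script" "$work/xid.txt")
printf '%s\n' "$preview"

# In-place edit that keeps the original as xid.txt.bak.
sed -i.bak "$script" "$work/xid.txt"
edited=$(cat "$work/xid.txt")
backup=$(cat "$work/xid.txt.bak")

rm -rf "$work"
```

Unlike the awk version, this leaves the spacing of each line exactly as it was, because only the matched text is rewritten.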