Rearrange the data based on a pattern

awk, sed, text processing

I have a file like this:

A1: abc.com B1: Hi there
B1: Your Test mail  A1: gml.com
B1: Your new mail   A1: hml.com
A1: def.com B1: Test email
B1: hello world A1: yml.com

I want to always pick the A1: <string> part first followed by the B1: <string> part.

I've tried grep and awk as shown below:

 grep -Po '(?<=A1:)\W*\K[^ ]*' file.txt 
 awk -F"A1:|B1:" '{print $1 $2}' file.txt 

But neither of them gives the expected result.

I want the output to be like this:

 A1: abc.com   B1: Hi there
 A1: gml.com   B1: Your Test mail   
 A1: hml.com  B1: Your new mail 
 A1: def.com  B1: Test email
 A1: yml.com  B1: hello world

Best Answer

You could leave the lines starting with A1: as they are and rearrange only those starting with B1:

# if -E or -r is not supported: sed 's/\(B1:.*\)\(A1:.*\)/\2 \1/' ip.txt
$ sed -E 's/(B1:.*)(A1:.*)/\2 \1/' ip.txt
A1: abc.com B1: Hi there
A1: gml.com B1: Your Test mail  
A1: hml.com B1: Your new mail   
A1: def.com B1: Test email
A1: yml.com B1: hello world 
  • .* is greedy, so this solution assumes that A1: and B1: each occur only once per line
  • (B1:.*)(A1:.*) are two capture groups - to satisfy the entire expression, the first one captures everything from B1: up to just before A1:, and the second one captures everything from A1: to the end of the line
  • \2 \1 rearranges the captured strings, with a space in between
  • Further reading: https://www.gnu.org/software/sed/manual/sed.html#Back_002dreferences-and-Subexpressions
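The greedy caveat from the first bullet can be seen with a small sketch. The function name `rearrange` and the second sample line are made up for illustration; only the sed command itself comes from the answer above.

```shell
# rearrange is a hypothetical wrapper around the sed command shown above.
rearrange() {
  sed -E 's/(B1:.*)(A1:.*)/\2 \1/'
}

# One A1: and one B1: on the line: the two parts are swapped.
printf '%s\n' 'B1: hello world A1: yml.com' | rearrange

# Two A1: markers on one line (invented input): the greedy .* in the first
# group extends to the last A1:, so only the final A1: part moves to the front.
printf '%s\n' 'B1: x A1: a.com A1: b.com' | rearrange
```

In both cases the space that preceded A1: in the input stays attached to the B1: part, which is why the rearranged lines end with trailing whitespace.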


With awk

$ awk -F'A1:' '{print $1 ~ /B1:/ ? FS $2 " " $1 : $0}' ip.txt
A1: abc.com B1: Hi there
A1: gml.com B1: Your Test mail  
A1: hml.com B1: Your new mail   
A1: def.com B1: Test email
A1: yml.com B1: hello world 

If the first field contains B1:, rearrange the fields; otherwise print the input line as is.
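If the trailing spaces on the rearranged lines are unwanted, one possible variation is to trim the B1: part before rebuilding the line. This is only a sketch and not part of the original answer: the function name `a1_first` and the variable `b` are arbitrary, and inline sample input stands in for ip.txt.

```shell
# a1_first is a hypothetical name; it reads lines on stdin.
a1_first() {
  awk -F'A1:' '
    $1 ~ /B1:/ {               # the text before A1: contains B1:
      b = $1
      sub(/[ \t]+$/, "", b)    # drop trailing blanks from the B1: part
      $0 = FS $2 " " b         # rebuild as A1: part, a space, then B1: part
    }
    { print }                  # other lines pass through unchanged
  '
}

printf '%s\n' 'A1: abc.com B1: Hi there' 'B1: Your Test mail  A1: gml.com' | a1_first
```

The bracket expression `[ \t]` is used instead of a POSIX character class on the assumption that some older awk implementations do not support `[[:space:]]`.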
