Take two columns in a tab delimited file and merge into one

command-line, text-processing

I was wondering how I would take data that was in this format as a tab-delimited file:

A  red     green  
B  yellow  orange  
C  blue    purple  

And to use commands like grep, paste, cut, cat, etc. to turn it into the following:

A red
B yellow
C blue
A green
B orange
C purple

Best Answer

Similar to the cut approach, you can also do it with awk:

$ awk '{print $1,$2}' aa.txt && awk '{print $1,$3}' aa.txt
A red
B yellow
C blue
A green
B orange
C purple
# OR to send the output to a new file:
$ (awk '{print $1,$2}' aa.txt && awk '{print $1,$3}' aa.txt) >aaa.txt

The difference is that awk handles whitespace better than cut. This is useful if the fields in each line are separated by more than one space.

For example, if a line in the file is "A red" (fields separated by one space), the cut solution also works. But if the line is "A   red" (three spaces), cut fails, while awk still extracts fields 1 and 2 or fields 1 and 3.
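The whitespace difference described above can be reproduced with a small experiment. This is only a sketch: the temporary sample file and the exact `cut -d ' ' -f 1,2` invocation are assumptions, since the cut solution referred to is not shown here.

```shell
# Hypothetical sample data: fields separated by three spaces instead of one.
tmp=$(mktemp)
printf 'A   red   green\nB   yellow   orange\n' > "$tmp"

# cut treats every single space as a delimiter, so "field 2" is the empty
# string between the first and second space.
cut_out=$(cut -d ' ' -f 1,2 "$tmp")

# awk's default field splitting collapses runs of blanks into one separator.
awk_out=$(awk '{print $1,$2}' "$tmp")

printf 'cut:\n%s\nawk:\n%s\n' "$cut_out" "$awk_out"
rm -f "$tmp"
```

With this input, cut returns only the first letter followed by a space on each line, while awk returns the letter and the first colour.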

Update:
As advised in comments (thanks don_crissti) this can also be done in pure awk:

awk 'BEGIN{FS=OFS=" "}{z[NR]=$1FS$3; print $1,$2}END{for (i=1; i<=NR; i++){print z[i]}}' aa.txt

Explanation:

FS           : Input Field Separator
OFS          : Output Field Separator
FS=OFS=" "   : input & output field separator is set to "space"
z[NR]        : Creates an array named 'z', indexed by the record number:
               z[1] for the first line, z[2] for the second line, z[3] for the third line
z[NR]=$1FS$3 : Assigns to each array element field 1, the field separator (FS = space) and field 3
So for the first line, field 1 = A and field 3 = green are stored in z[1], i.e. z[1]="A green"

print $1,$2  : Prints field 1 (A) and field 2 (red) of the current line, separated by OFS

When the whole file has been read (END), a for loop prints every entry of the z array => print z[i]
For i=1 => print z[1] => prints "A green"
For i=2 => print z[2] => prints "B orange"
For i=3 => print z[3] => prints "C purple"
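To check that the array version really produces the same lines as running awk twice, the two can be compared on a small sample. This is a sketch under assumptions: the temporary file and the variable names `two_pass` and `one_pass` are made up for illustration, and the one-liner is spread over several lines (using OFS as the joiner) only for readability.

```shell
# Hypothetical space-separated sample data, mirroring the question's example.
tmp=$(mktemp)
printf 'A red green\nB yellow orange\nC blue purple\n' > "$tmp"

# Two separate awk runs, as in the first solution.
two_pass=$(awk '{print $1,$2}' "$tmp" && awk '{print $1,$3}' "$tmp")

# Single awk run that buffers the second half in the array z.
one_pass=$(awk '
    { z[NR] = $1 OFS $3   # remember "field1 field3" for the END block
      print $1, $2 }      # emit "field1 field2" immediately
    END { for (i = 1; i <= NR; i++) print z[i] }
' "$tmp")

if [ "$two_pass" = "$one_pass" ]; then echo "outputs match"; fi
printf '%s\n' "$one_pass"
rm -f "$tmp"
```

The single-run version reads the file only once, at the cost of keeping one string per input line in memory until the end.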

PS: If the fields are separated by tabs rather than spaces, and the output should be tab-separated as well, the BEGIN block of this awk one-liner should be changed to `awk 'BEGIN {FS=OFS="\t"}....`
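A minimal sketch of the tab-separated variant, written out in full. The sample file is created with printf purely for illustration and is not part of the original answer.

```shell
# Hypothetical tab-delimited sample data matching the question's layout.
tmp=$(mktemp)
printf 'A\tred\tgreen\nB\tyellow\torange\nC\tblue\tpurple\n' > "$tmp"

# FS and OFS are both set to a tab, so fields are split on tabs only
# and the output lines are joined with a tab as well.
result=$(awk 'BEGIN { FS = OFS = "\t" }
    { z[NR] = $1 FS $3; print $1, $2 }
    END { for (i = 1; i <= NR; i++) print z[i] }' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Setting FS to a tab also keeps a field such as "light blue" intact, which whitespace splitting would break into two fields.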