Lum – AWK – a question about columns

I have a question. I tried to solve it myself, but I am too new to awk to make it work.

Assume we have a file (e.g. database.txt) with tab-separated values:

NA64715 YU24921 MI84612 MI98142 NA94732    
3241531 4957192 4912030 6574918 0473625     
0294637 9301032 8561730 8175919 8175920     
9481732 9359032 8571930 8134983 9385130     
9345091 9385112 2845830 4901742 3455141

In a separate file (e.g. populations.txt) I have information about which ID belongs to which group, e.g.:

NA64715 Europe    
YU24921 Europe    
MI84612 Asia    
MI98142 Africa    
NA94732 Asia

What I need is for awk to create a separate file for each group (Europe, Asia, Africa), containing the columns that belong to that group. The file I need to work on is huge, so I cannot simply count the columns and select them by number. Awk has to look up which population each ID belongs to (Europe etc.), find that ID's column in the database file, and copy the whole column to the output file for that population.

The result should look like:

File 1 (europe.txt):

NA64715 YU24921     
3241531 4957192     
0294637 9301032     
9481732 9359032    
9345091 9385112

File 2 (asia.txt):

MI84612 NA94732    
4912030 0473625    
8561730 8175920    
8571930 9385130    
2845830 3455141

File 3 (africa.txt):

MI98142
6574918
8175919
8134983
4901742

Can anyone help me with this issue?

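awk -F '\t' '
    # populations.txt: remember the population of each ID
    NR==FNR { population[$1] = $2; next }
    # header of database.txt: choose the output file for each column
    FNR==1 {
        for (i=1; i<=NF; i++) {
            destination[i] = population[$i] ".txt"
        }
    }
    # every line: write each field to the file of its column
    {
        delete separator
        for (i=1; i<=NF; i++) {
            printf "%s%s", separator[destination[i]], $i > destination[i]
            separator[destination[i]] = FS
        }
        for (file in separator) {
            printf "\n" > file
        }
    }
' populations.txt database.txt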
