I have a question. I tried to solve it myself, but I am too new to awk to make it work.
Let's assume that we have a file (e.g. database.txt) whose values are tab-separated:
NA64715 YU24921 MI84612 MI98142 NA94732
3241531 4957192 4912030 6574918 0473625
0294637 9301032 8561730 8175919 8175920
9481732 9359032 8571930 8134983 9385130
9345091 9385112 2845830 4901742 3455141
In a separate file (e.g. populations.txt) I have information about which ID belongs to which group, e.g.:
NA64715 Europe
YU24921 Europe
MI84612 Asia
MI98142 Africa
NA94732 Asia
What I need is for awk to create separate files with the columns for each group (Europe, Asia, Africa). The file I need to work on is huge, so I cannot simply count and number the columns by hand. I need awk to check which ID belongs to which population (Europe etc.), find that column in the database file, and copy the whole column to a new file (one per population).
The result should look like:
File 1 (europe.txt):
NA64715 YU24921
3241531 4957192
0294637 9301032
9481732 9359032
9345091 9385112
File 2 (asia.txt):
MI84612 NA94732
4912030 0473625
8561730 8175920
8571930 9385130
2845830 3455141
File 3 (africa.txt):
MI98142
6574918
8175919
8134983
4901742
Can anyone help me with this issue?
Best Answer
This works in a single pass through the file and does not need to hold the whole file in memory. It does, however, keep a file descriptor open for each destination file.