I have 20 tab-delimited files with the same number of rows. I want to select the 4th column of each file and paste those columns together into a new file. In the end, the new file will have 20 columns, each one coming from a different one of the 20 files.
How can I do this with Unix/Linux commands?
Input: 20 files, all in this same format.
I want the 4th column, denoted here as A1 for file 1:
chr1 1734966 1735009 A1 0 0 0 0 0 1 0
chr1 2074087 2083457 A1 0 1 0 0 0 0 0
chr1 2788495 2788535 A1 0 0 0 0 0 0 0
chr1 2821745 2822495 A1 0 0 0 0 0 1 0
chr1 2821939 2822679 A1 1 0 0 0 0 0 0
...
Output file, with 20 columns, each column coming from one of the 20 files' 4th column:
A1 A2 A3 ... A20
A1 A2 A3 ... A20
A1 A2 A3 ... A20
A1 A2 A3 ... A20
A1 A2 A3 ... A20
...
Best Answer
With paste under bash you can do:
With a Python script and any number of files (python scriptname.py column_nr file1 file2 ... filen):
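A minimal sketch of such a script, assuming tab-separated input and a 1-based column_nr as in the usage line above. The function name paste_column is made up for illustration, and the script relies on all files having the same number of rows, as stated in the question:

```python
import sys


def paste_column(column_nr, paths, out=sys.stdout):
    """Write field number column_nr (1-based) of every file side by side, tab-separated."""
    files = [open(path) for path in paths]
    try:
        # zip() reads the files in lockstep and stops at the shortest one;
        # the question states that all files have the same number of rows.
        for rows in zip(*files):
            fields = [row.rstrip("\r\n").split("\t")[column_nr - 1] for row in rows]
            out.write("\t".join(fields) + "\n")
    finally:
        for handle in files:
            handle.close()


# usage: python scriptname.py column_nr file1 file2 ... filen
if __name__ == "__main__" and len(sys.argv) > 2:
    paste_column(int(sys.argv[1]), sys.argv[2:])
```

For the case in the question this would be called with 4 as column_nr, followed by the 20 file names, with the output redirected to the new file. Because the files are read line by line in parallel, none of them has to fit in memory.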