I have a .txt file that looks like this:
NAME | CODE
name1 | 001
name2 | 001
name3 | 002
name4 | 003
name5 | 003
name6 | 003
I need to write a script to split this file according to the CODE
column, so in this case I'd get this:
file 1:
NAME | CODE
name1 | 001
name2 | 001
file 2:
NAME | CODE
name3 | 002
file 3:
NAME | CODE
name4 | 003
name5 | 003
name6 | 003
According to some research, using awk would work:
$ awk -F, '{print > $2".txt"}' inputfile
The thing is, I also need to include the header as the first line of each output file, and I need the file names to be different. Instead of 001.txt, for example, I need the file name to be something like FILE_$FILENAME_IDK.txt.
Best Answer
You could try like this:
The above saves the header in a variable h (NR==1{h=$0; next}). Then, if $3 has not been seen (!seen[$3]++, i.e. if it's the first time it encounters the current value of $3), it sets the filename (f=...) and writes the header to that file (print h > f). Then it appends the entire line to the file (print >> f). It uses the default FS (field separator), blank, so the lone | counts as field $2 and the code is field $3. If you want to use | as FS (or even a regex with gnu awk), see cas' comment below.
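The steps described above can be sketched as a small self-contained script. This is only an illustration under a few assumptions: the sample data is copied from the question, the naming scheme FILE_<code>_IDK.txt is a guess at what the asker wants, and the close(f) calls are an addition not mentioned in the explanation (they keep the number of simultaneously open files low when there are many distinct codes).

```shell
#!/bin/sh
# Work in a scratch directory so the sample files do not clutter anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input taken from the question.
cat > inputfile <<'EOF'
NAME | CODE
name1 | 001
name2 | 001
name3 | 002
name4 | 003
name5 | 003
name6 | 003
EOF

awk '
    NR == 1 { h = $0; next }         # save the header line and skip it
    {
        f = "FILE_" $3 "_IDK.txt"    # assumed naming scheme, adjust as needed
        if (!seen[$3]++) {           # first time this CODE value appears
            print h > f              # create (or truncate) the file with the header
            close(f)
        }
        print >> f                   # append the current line
        close(f)                     # added here: avoid too many open files
    }
' inputfile
```

With the default field separator the code is $3. If | (with optional surrounding blanks) should be the separator instead, one option would be to pass something like -F ' *[|] *' and use $2 in place of $3.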