How to split a file alternating the prefix used for the output files

Tags: awk, split, text-processing

I have a large file made up of parts of 40 lines each. There are two types of parts, and they alternate. The two types should be numbered independently: the first part should be X_0001, the second Y_0001, then X_0002, Y_0002, etc.

I used this command but it can only split into pieces with the same prefix:

 split -d -l 40 -a 4  inputfile X_ 

Best Answer

One way is to use split and rename the files afterwards.
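A minimal sketch of that split-then-rename approach, assuming GNU split (for -d, as in the question) and that no other files matching part_* exist in the current directory. The function name split_alternating and the temporary prefix part_ are made up for illustration:

```shell
# Hypothetical helper: split "$1" into 40-line chunks, then rename them
# alternately to X_0001, Y_0001, X_0002, Y_0002, ...
split_alternating() {
  input=$1
  # "part_" is a temporary prefix; assumes no other part_* files exist here.
  split -d -l 40 -a 4 "$input" part_
  n=0
  # The glob expands in sorted order: part_0000, part_0001, ...
  for f in part_*; do
    if [ $(( n % 2 )) -eq 0 ]; then prefix=X; else prefix=Y; fi
    # Chunks 0 and 1 get number 1, chunks 2 and 3 get number 2, and so on.
    mv -- "$f" "$(printf '%s_%04d' "$prefix" $(( n / 2 + 1 )))"
    n=$(( n + 1 ))
  done
}

# Usage: split_alternating inputfile
```

This relies on the shell expanding part_* in sorted order, which matches the order in which split wrote the chunks because the numeric suffixes have a fixed width.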

But the simplest approach is probably to use awk. You can use the > redirection operator to write to a file instead of standard output. The variable NR contains the number of the current record, which by default is the current line number.

Awk's redirection automatically takes care of opening files. You should close files explicitly if you use a lot of different ones, otherwise you might run into a limit on open files.

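awk '
  # Every 40 lines, close the current file and pick the next name (X, Y, X, ...).
  (NR-1) % 40 == 0 { close(out); out = sprintf("%s_%04d", (NR % 80 == 1 ? "X" : "Y"), int(NR/80) + 1) }
  # Write each line to the current output file.
  { print > out }
' inputfile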
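The same idea can also be written with an explicit chunk counter, which may be easier to adapt if the part length changes. This is only a sketch: the function name split_alternating_awk, the awk variable n, and the optional second argument are assumptions for illustration, not part of the original command.

```shell
# Hypothetical wrapper: $1 is the input file, $2 the part length (default 40).
split_alternating_awk() {
  awk -v n="${2:-40}" '
    (NR - 1) % n == 0 {
      # Close the previous output file, if any, before switching.
      if (out != "") close(out)
      # chunk counts parts from 0: even chunks are X, odd chunks are Y.
      chunk = int((NR - 1) / n)
      out = sprintf("%s_%04d", (chunk % 2 == 0 ? "X" : "Y"), int(chunk / 2) + 1)
    }
    { print > out }
  ' "$1"
}

# Usage: split_alternating_awk inputfile
```

Computing the prefix and the number from a single chunk index avoids hard-coding 80 as twice the part length.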