Lum – Join every other column with sed or awk

awk, columns, sed

I have a large text file (666000 columns) in the format:

A B C D E F

Desired output

AB CD EF

How can I do this in sed or awk? I have tried a couple of things, but nothing seems to work. Please suggest something.

Best Answer

In sed:

sed 's! \([^ ]\+\)\( \|$\)!\1 !g' your_file

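This makes the substitutions and prints the result to standard output. To modify the file in place, add the -i switch (as written, this is GNU sed syntax; BSD/macOS sed expects an explicit backup suffix argument after -i, which may be empty):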

sed -i 's! \([^ ]\+\)\( \|$\)!\1 !g' your_file

Explanation

This sed command looks for a space, followed by at least one non-space character, followed by either a space or the end of the line. It replaces that sequence with the non-space characters it captured, followed by a single space. Because the g modifier is supplied at the end, the substitution is applied as many times as possible across the line (a global substitution). So, with a sequence like A B C, sed finds the pattern " B " and replaces it with "B ", leaving AB C as the final result.
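To see the behaviour described above on the sample line from the question, the command can be wrapped in a small shell function and fed test input. This is only a sketch: the function name join_pairs_sed is made up for illustration, and the \+ and \| operators assume GNU sed.

```shell
# Hypothetical helper name; wraps the sed command from the answer.
# Assumes GNU sed, since \+ and \| are GNU extensions to basic regular expressions.
join_pairs_sed() {
    sed 's! \([^ ]\+\)\( \|$\)!\1 !g'
}

# Even number of columns: the last pair is matched via the end-of-line branch.
printf 'A B C D E F\n' | join_pairs_sed

# Odd number of columns: the final column is left on its own.
printf 'A B C\n' | join_pairs_sed
```

Note that when the line has an even number of columns, the last match ends at the end of the line and is still replaced with "\1 ", so the output carries a trailing space. If that matters, it can be stripped with a second expression such as -e 's! $!!'.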

Assumptions made by this code

This code assumes that the separators between your columns are single literal spaces and not, for example, TABs. This is easily fixed at the expense of readability:

sed 's![[:blank:]]\+\([^[:blank:]]\+\)\([[:blank:]]\+\|$\)!\1 !g' your_file
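Since the question also asks about awk, here is a minimal awk sketch of the same idea. It is not part of the original answer, and the function name join_pairs_awk is made up for illustration. It relies on awk's default field splitting, which treats any run of spaces or TABs as a single separator, so it needs no special handling for TABs.

```shell
# Hypothetical helper name; joins fields 1+2, 3+4, ... of every input line.
# Reads the files given as arguments, or standard input if there are none.
join_pairs_awk() {
    awk '{
        for (i = 1; i <= NF; i += 2) {
            if (i > 1) printf " "              # separator between joined pairs
            printf "%s", $i                    # first column of the pair
            if (i < NF) printf "%s", $(i + 1)  # second column, if it exists
        }
        print ""                               # end the output line
    }' "$@"
}

printf 'A B C D E F\n' | join_pairs_awk
printf 'A\tB  C\tD E\n' | join_pairs_awk
```

Unlike the sed version, this prints no trailing space, and a line with an odd number of columns keeps its last column on its own. Run it against a file with join_pairs_awk your_file.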