text-processing sed – Write a sed One-Liner to Add Character After Every Third Character

So, I have a string that looks like this:

AUGGCCAUGGCGCCCAGAACUGAGAUCAAUAGUACCCGUAUUAACGGGUGA

And I want to split the string into 3-character chunks delimited by a '+' sign.

AUG+GCC+AUG+GCG+CCC+AGA+ACU+GAG+AUC+AAU+AGU+ACC+CGU+AUU+AAC+GGG+UGA

And I want to do that with my good friend sed.

I tried

cat codons | sed -r 's/([A-Z]\{3\})/\1\+/g'

…with no success.

What sed command can I use?

Best Answer

Since you don't want a trailing +, you could do:

fold -w3 | paste -sd+ -

That is, fold the line at a width of 3 characters, then paste the resulting 3-character lines back together with + as the delimiter, which in effect turns every newline character except the last one into a +. If the input has more than one line, those lines also end up joined with a +, which may or may not be what you want.
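As a quick check of the pipeline described above, here is a small sketch run against the sample string from the question. The variable names (`seq`, `joined`, `two_lines`) are only for illustration, and the second call shows the multi-line caveat using a made-up two-line input.

```shell
# Sample input from the question, held in a variable for illustration.
seq='AUGGCCAUGGCGCCCAGAACUGAGAUCAAUAGUACCCGUAUUAACGGGUGA'

# fold breaks the line every 3 characters; paste -s joins the pieces with '+'.
joined=$(printf '%s\n' "$seq" | fold -w3 | paste -sd+ -)
printf '%s\n' "$joined"

# Caveat: separate input lines are joined with '+' as well
# (hypothetical two-line input, not from the question).
two_lines=$(printf 'AUGGCC\nAUGGCG\n' | fold -w3 | paste -sd+ -)
printf '%s\n' "$two_lines"
```

The first command should print the `AUG+GCC+...+UGA` string the question asks for, and the second prints `AUG+GCC+AUG+GCG` on a single line.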

If it does have to be sed, add a + after every three characters and then remove the trailing one:

sed 's/.../&+/g;s/+$//'
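As for why the attempt in the question did nothing: with `-r` (extended regular expressions), `\{3\}` most likely matches literal braces rather than acting as a repeat count, so the pattern never matches this input. A minimal sketch of both forms follows, assuming a sed that accepts `-E` (GNU sed also takes `-r`); the variable names are illustrative only.

```shell
# Sample input from the question.
seq='AUGGCCAUGGCGCCCAGAACUGAGAUCAAUAGUACCCGUAUUAACGGGUGA'

# Basic-regex form from the answer: '&' is the matched text, and an
# unescaped '+' is a literal plus sign here.
plain=$(printf '%s\n' "$seq" | sed 's/.../&+/g;s/+$//')

# Extended-regex variant closer to the question's attempt: with -E the
# braces are written unescaped, and the literal '+' must be escaped.
ere=$(printf '%s\n' "$seq" | sed -E 's/[A-Z]{3}/&+/g; s/\+$//')

printf '%s\n' "$plain" "$ere"
```

Both commands should print the same `+`-delimited string. The `[A-Z]` variant only touches uppercase letters, whereas `...` groups any three characters.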