Ubuntu – Understanding sed expression to replace the last word from each line with the first one

command-line, sed, text-processing

I have to replace the last word from each line with the first one. The code is:

$ sed "s/\(^a-z,0-9]*\)\(.*\)\([a-z,0-9]*$\)/\1\2\1\g". 

I don't understand this part: \(^a-z,0-9]*\)\(.*\)\([a-z,0-9]*$\), especially \(.*\).

Best Answer

After correcting the basic syntax mistakes, you have:

sed "s/\(^[a-z,0-9]*\)\(.*\)\([a-z,0-9]*$\)/\1\2\1/g"
  • s/old/new/ replaces old with new
  • \(^[a-z,0-9]*\) saves zero or more lowercase letters, digits, or commas at the start of the line (^ anchors the start of the line) so they can be referenced later as \1. The comma inside the brackets is a literal character, not a separator.
  • \(.*\) saves any number of any characters (referenced later as \2)
  • \([a-z,0-9]*$\) saves zero or more lowercase letters, digits, or commas at the end of the line ($ anchors the end of the line), referenced later as \3
  • \1\2\1 prints the first group, then the second, then the first again
  • g is inappropriate in this expression. It means "act on every match on the same line", but our expression has to match the whole line, so there can be only one match; g makes no sense here and should be omitted.
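As a minimal sketch of how the saved groups are reused, the command below swaps two fixed words. The sample text and the variable name swapped_pair are made up for illustration and are not part of the original question.

```shell
# \(hello\) is saved as \1 and \(world\) as \2;
# the replacement prints them in the opposite order.
swapped_pair=$(printf '%s\n' 'hello world' | sed 's/\(hello\) \(world\)/\2 \1/')
printf '%s\n' "$swapped_pair"   # prints: world hello
```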

The corrected expression still will not work, because the * quantifier is greedy: the middle \(.*\) matches everything after the first word up to the end of the line, which leaves the third group matching only the empty string. As a result, the first word is appended to the end of the line and nothing is replaced.
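The greedy behaviour can be seen on a sample line. This is only a sketch: the input 'one two three' and the variable names are made up for illustration, and the g flag is left out.

```shell
line='one two three'

# \1 = "one", \2 = " two three" (greedy), \3 = "" (nothing left to match),
# so the replacement \1\2\1 just appends "one" to the unchanged line.
greedy=$(printf '%s\n' "$line" | sed 's/\(^[a-z,0-9]*\)\(.*\)\([a-z,0-9]*$\)/\1\2\1/')
printf '%s\n' "$greedy"   # prints: one two threeone
```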

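You could fix it by matching the spaces around the middle group explicitly, so that \(.*\) has to stop before the last word (also adding I for case-insensitive matching; the I flag is a GNU sed extension):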

sed "s/\(^[a-z,0-9]*\) \(.*\) \([a-z,0-9]*$\)/\1 \2 \1/I"
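A quick check of that command on sample input, assuming GNU sed (needed for the I flag). The input lines and variable names are hypothetical.

```shell
# The literal spaces force \(.*\) to give back the last word,
# so the third group now captures it and \1 takes its place.
fixed=$(printf '%s\n' 'one two three' \
  | sed 's/\(^[a-z,0-9]*\) \(.*\) \([a-z,0-9]*$\)/\1 \2 \1/I')
printf '%s\n' "$fixed"         # prints: one two one

# With I, uppercase letters are matched by [a-z] as well.
fixed_mixed=$(printf '%s\n' 'One two Three' \
  | sed 's/\(^[a-z,0-9]*\) \(.*\) \([a-z,0-9]*$\)/\1 \2 \1/I')
printf '%s\n' "$fixed_mixed"   # prints: One two One
```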

If the words may contain characters other than letters and numbers:

sed -r 's/^([^ ]+) (.*) ([^ ]+)$/\1 \2 \1/'
  • -r use extended regular expressions (ERE), which saves all those backslashes; -E is the more portable spelling of the same option
  • [^ ]+ at least one of any character except a space
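The ERE version can be tried on a line containing punctuation. This is a sketch assuming GNU sed (for -r); the sample lines and variable names are made up for illustration.

```shell
# Words are now "runs of non-space characters", so punctuation is kept.
swapped=$(printf '%s\n' 'Hello, big wide world!' \
  | sed -r 's/^([^ ]+) (.*) ([^ ]+)$/\1 \2 \1/')
printf '%s\n' "$swapped"   # prints: Hello, big wide Hello,

# The pattern needs two literal spaces, so a two-word line
# does not match and is printed unchanged.
short=$(printf '%s\n' 'only two' \
  | sed -r 's/^([^ ]+) (.*) ([^ ]+)$/\1 \2 \1/')
printf '%s\n' "$short"     # prints: only two
```

Note that lines with fewer than three space-separated words pass through untouched, which may or may not be what you want.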