How to keep part of the matched pattern and reuse it in the replacement with BSD sed

regular expression, sed

I have found a solution for GNU sed, and I also found something like \1, but the Terminal tells me it's "not defined in RE".

But what I want to do is this:

If I have a string like sinder-city1.gif, I want to turn it into sinder-city.gif. I can't simply replace the entire string, because I need to do this to many different strings. Nor can I just remove all digits before .gif, or use some other blanket pattern, because if there are also

sinder-city2.gif
sinder-city3.gif

I want them to stay intact.

To match the part I want to change, I type sed 's,[a-z]1.gif, but if I remove everything that matches, I am left with sinder-cit.gif. How do I match that y and keep it?

I want to do something like this:

sed 's,[a-z]1.gif,[here is the last letter].gif,g'

It has to work on BSD sed.

Best Answer

sed 's,\([a-z]\)1\.gif$,\1.gif,g'

or, if you want to allow any non-digit before the 1

sed 's,\([^0-9]\)1\.gif$,\1.gif,g'
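
As a quick check, here is a minimal sketch that runs the first command over the three file names from the question. The function name strip_one is only a hypothetical wrapper for illustration; the sed expression is the one given above.

```shell
# strip_one is a hypothetical helper name; it wraps the first command above.
strip_one() {
  sed 's,\([a-z]\)1\.gif$,\1.gif,g'
}

# Only the name ending in <letter>1.gif should change.
printf '%s\n' sinder-city1.gif sinder-city2.gif sinder-city3.gif | strip_one
# sinder-city.gif
# sinder-city2.gif
# sinder-city3.gif
```

A name such as sinder-city11.gif is also left alone, because the character before the final 1 is a digit rather than a letter.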

The backslash-parenthesis construct \( ... \) delimits a capture group, which the FreeBSD man page calls a “bracket expression” (despite the use of parentheses; square brackets mean something else). Note that sed uses basic regular expressions (BRE), not extended regular expressions (ERE). The man page describes ERE, and its last paragraph explains how BRE syntax differs from ERE syntax. I find the POSIX specification more readable than the BSD man page here; it calls capture groups back-reference expressions. The GNU sed manual is more readable than either; just avoid the features described as GNU extensions.

Given a capture group (a.k.a. back-reference expression), you can use backslash+digit in the replacement text to mean “the text matched by the corresponding capture group”. For example, \1 in the replacement text is replaced by the text matched by the first capture group in the regular expression. Here there's a single capture group, which captures the letter before 1.gif.
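
To illustrate how the digits map to groups, here is a small sketch with two capture groups, where \1 and \2 are swapped in the replacement. The function name swap_parts and the input string are made up for the example; only BRE syntax is used, so it should behave the same on BSD and GNU sed.

```shell
# swap_parts is a hypothetical helper: group 1 is the word before the
# hyphen, group 2 is the word after it; the replacement swaps them.
swap_parts() {
  sed 's,\([a-z]*\)-\([a-z]*\),\2-\1,'
}

echo 'city-sinder' | swap_parts
# sinder-city
```

Groups are numbered by the order of their opening \( from left to right.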

I changed 1.gif to 1\.gif to match the dot literally, and added a trailing $ to match only at the end of the line.
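
Here is a rough sketch of why those two changes matter. The function names loose and strict, and the two sample strings, are invented for the comparison: loose uses the unescaped dot and no anchor, strict uses the expression from this answer.

```shell
# Hypothetical helpers for comparison only.
loose()  { sed 's,\([a-z]\)1.gif,\1.gif,'; }       # "." matches any character, no anchor
strict() { sed 's,\([a-z]\)1\.gif$,\1.gif,'; }     # literal dot, end of line only

# Unescaped dot: "x" is accepted where a dot was intended.
echo 'city1xgif' | loose              # city.gif
echo 'city1xgif' | strict             # city1xgif

# No anchor: the match may sit in the middle of the name.
echo 'sinder-city1.gif.bak' | loose   # sinder-city.gif.bak
echo 'sinder-city1.gif.bak' | strict  # sinder-city1.gif.bak
```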

To give another example of capture groups, if you wanted to operate on arbitrary extensions, you could use something like

sed 's,\([^0-9]\)1\(\.[^./]*\)$,\1\2,g'
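
A short sketch of that last command on a few made-up names (the function name strip_one_any_ext and the sample file names are assumptions for illustration). Group 1 keeps the non-digit before the 1, and group 2 keeps the final extension, which may not contain another dot or a slash.

```shell
# strip_one_any_ext is a hypothetical helper wrapping the command above.
strip_one_any_ext() {
  sed 's,\([^0-9]\)1\(\.[^./]*\)$,\1\2,g'
}

printf '%s\n' sinder-city1.gif photo1.png archive1.tar.gz sinder-city2.gif |
  strip_one_any_ext
# sinder-city.gif
# photo.png
# archive1.tar.gz
# sinder-city2.gif
```

archive1.tar.gz is unchanged because the 1 is followed by two extensions, and the pattern only allows a single dot-free extension before the end of the line.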