How to use sed to replace two instances of the same digits separated by a slash with one instance of those digits

regular expression, sed

I want to use sed to replace two instances of the same digits separated by a slash with one instance of those digits. My input files have lines like this:

text (1982/1982) text
text (1983/1983) text
text (1984/1984) text

I want output like this:

text (1982) text
text (1983) text
text (1984) text

I have to match the parentheses because there may be other strings of digits separated by a slash in the input files.

In BBEdit I can do this with the search pattern \(([0-9]{4})/\1\) and the replace pattern \(\1\). But in sed the equivalent extended regular expressions do not seem to work:

echo 'text (1984/1984) text' | sed -E 's_\(([0-9]{4})/\1\)_\(\1\)_g'

returns:

text (1984/1984) text

but instead I want:

text (1984) text

What are the extended regular expressions that will do this in sed?

I am using the built-in sed in macOS.

Best Answer

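macOS ships the BSD version of sed, which differs from GNU sed in several annoying ways. As your output shows, it does not honor the \1 back-reference inside an extended (-E) search pattern; POSIX leaves back-references undefined for extended regular expressions, so this is permitted behavior. I usually install GNU sed via Homebrew: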

$ brew search sed
==> Formulae
gnu-sed ✔             libxdg-basedir        minised               ssed

==> Casks
eclipse-dsl                                  marsedit
exoduseden                                   microsoft-bing-ads-editor
focused                                      osxfuse-dev
google-adwords-editor                        physicseditor
lego-mindstorms-education-ev3                prefs-editor
licensed                                     subclassed-mnemosyne

Install it:

$ brew install gnu-sed

You can then use it like so:

$ gsed ....

And voilà, your example now works with gsed (the first command below is the built-in sed, shown for comparison):

$ echo 'text (1984/1984) text' | sed -E 's_\(([0-9]{4})/\1\)_\(\1\)_g'
text (1984/1984) text
$ echo 'text (1984/1984) text' | gsed -E 's_\(([0-9]{4})/\1\)_\(\1\)_g'
text (1984) text
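
If installing GNU sed is not an option, one possible alternative is to drop -E and use a basic regular expression, where POSIX does define back-references. The sketch below assumes the same input format as in the question; the function name dedupe_year is made up for illustration. Note that the escaping is inverted compared with -E: plain parentheses are literal, while \( \) form a group and \{4\} is the repeat count.

```shell
# Basic regular expression (no -E):
#   (  )    literal parentheses
#   \( \)   capture group
#   \{4\}   exactly four repetitions
#   \1      back-reference to the captured digits
# dedupe_year is a hypothetical helper name, not part of sed.
dedupe_year() {
  sed 's_(\([0-9]\{4\}\)/\1)_(\1)_g'
}

echo 'text (1984/1984) text' | dedupe_year
```

Because the parentheses are part of the pattern, other digit pairs such as 12/12 outside parentheses, or mismatched years such as (1984/1985), are left untouched.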
