sed 's,\([a-z]\)1\.gif$,\1.gif,g'
Or, if you want to allow any non-digit before the 1:
sed 's,\([^0-9]\)1\.gif$,\1.gif,g'
The backslash-parenthesis construct delimits a capture group, which the FreeBSD man page calls a “bracket expression” (despite the use of parentheses — square brackets mean something else). Note that sed uses basic regular expressions (BRE), not extended regular expressions (ERE); the man page describes ERE, and the last paragraph explains the difference between BRE syntax and ERE syntax. I find the POSIX specification more readable than the BSD man page here; it calls capture groups back-reference expressions. The GNU sed manual is more readable than either; just avoid the features described as GNU extensions.
Given a capture group (a.k.a. back-reference expression), you can use backslash+digit in the replacement text to mean “the text matched by the corresponding capture group”. For example, \1 in the replacement text is replaced by the text matched by the first capture group in the regular expression. Here there's a single capture group, which captures the letter before 1.gif.
I changed 1.gif to 1\.gif to match the dot literally, and added a trailing $ to match only at the end of the line.
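To see the effect of the escaped dot and the $ anchor, here is a small sketch. The file names are made up for illustration, and the helper name strip_one is hypothetical; only the sed expression comes from the answer above.

```shell
#!/bin/sh
# strip_one: remove a "1" that sits between a non-digit and a final ".gif".
strip_one() {
  sed 's,\([^0-9]\)1\.gif$,\1.gif,'
}

# Hypothetical sample names, one per line.
printf '%s\n' photo1.gif photo11.gif 1.gif photo1xgif | strip_one
# photo1.gif  -> photo.gif    (the "o" is captured and put back by \1)
# photo11.gif -> unchanged    (the character before the final 1 is a digit)
# 1.gif       -> unchanged    (there is no character before the 1)
# photo1xgif  -> unchanged    (\. only matches a literal dot)
```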
To give another example of capture groups, if you wanted to operate on arbitrary extensions, you could use something like
sed 's,\([^0-9]\)1\(\.[^./]*\)$,\1\2,g'
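A quick way to check the two-group version is to run it on a few made-up paths. The helper name strip_one_any_ext and the sample names are assumptions for illustration; \1 restores the character before the 1 and \2 restores the extension.

```shell
#!/bin/sh
# strip_one_any_ext: like the .gif version, but for any final extension.
strip_one_any_ext() {
  sed 's,\([^0-9]\)1\(\.[^./]*\)$,\1\2,'
}

# Hypothetical sample paths, one per line.
printf '%s\n' dir/photo1.png icon21.svg archive1.tar.gz | strip_one_any_ext
# dir/photo1.png  -> dir/photo.png
# icon21.svg      -> unchanged (a digit precedes the 1)
# archive1.tar.gz -> unchanged (the 1 is not directly before the last extension)
```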
I'm not sure why Perl isn't acceptable here. On the inputs you provided, this line gives the output you asked for:
perl -0777p -e 's/.* > (.*) joined the channel\.\n(((?!.* \1 (was kicked from channel\.|was banned from channel\.)\n).*\n)+?.*\1 disconnected)/\2/mg' irc.txt
The -e argument is exactly the first argument to your magicregextool, except that I added the /mg regex modifier. This may not be "unmodified", but it doesn't seem unreasonable either. If you don't want to type in the whole line, how about this script as magicregextool:
#!/usr/bin/perl -0777p
BEGIN { $::arg = shift @ARGV; }
eval $arg;
Or even:
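#!/bin/sh
# "$@" passes each argument through intact; an unquoted $* would split the expression at its spaces
perl -0777pe "$@"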
Then you just type:
magicregextool 's/.* > (.*) joined the channel\.\n(((?!.* \1 (was kicked from channel\.|was banned from channel\.)\n).*\n)+?.*\1 disconnected)/\2/mg' irc.txt
Which is the same as your sample (again other than adding the /mg modifier).
An additional benefit to this is that if you are running multiple related search/replace operations on each file, you can put them together in the same script:
#!/usr/bin/perl -0777p
s/.* > (.*) joined the channel\.\n(((?!.* \1 (was kicked from channel\.|was banned from channel\.)\n).*\n)+?.*\1 disconnected)/\2/mg;
s/(some other\n)matched text/\1/mg;
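For the shell wrapper variant, how the arguments are forwarded matters, because the expression contains spaces. An unquoted $* is split into words by the shell, whereas "$@" forwards each argument as it was given. A small sketch that only counts arguments (the function names are hypothetical, and perl is not involved):

```shell
#!/bin/sh
# count_args: print how many arguments were received.
count_args() { echo "$#"; }

# Forward the arguments unquoted: the shell splits them at whitespace.
forward_unquoted() { count_args $*; }

# Forward the arguments with "$@": each one arrives intact.
forward_quoted() { count_args "$@"; }

forward_unquoted 's/a b/c/' irc.txt   # prints 3
forward_quoted 's/a b/c/' irc.txt     # prints 2
```

With the unquoted form, perl would receive only the first fragment as its -e program and treat the remaining fragments as file names.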
vi is inspired by ex, ex is inspired by ed, and ed is inspired by QED.
QED was hacked together by Ken Thompson in the late 1960s for MIT's "Compatible Time-Sharing System" (an earlier version for the Berkeley Timesharing System was created by Butler Lampson, L. Peter Deutsch, and Dana Angluin). In short, Thompson added regular expressions to QED. He did a lot more than that, but it's outside the scope of this answer; Bell Labs has more on the history of QED.
One command in QED was the "G" or "Global" command. It allowed you to operate on all lines in the file at once (the previous version of QED was character-oriented instead of line-oriented).
Grep is actually named for one of the uses of this command, G/re/P (G: global, re: regular expression, P: print). In QED this command was used like G/bash/P to print out all lines containing the word bash. This was later included in ed, then taken out of ed and made into a standalone program (according to Doug McIlroy, he asked Ken to do it for him, and Ken left it on his desk the next morning).
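The same "print every line matching a regular expression" idea survives in today's tools. As a rough illustration with made-up input lines, grep and the sed form of the print command select the same lines (this sketch does not assume ed or QED is installed):

```shell
#!/bin/sh
# Hypothetical input lines.
sample_lines() { printf '%s\n' /bin/bash /bin/sh /usr/bin/bash; }

# Standalone grep: print lines matching the regular expression.
sample_lines | grep 'bash'

# sed equivalent: -n suppresses default output, /bash/p prints matching lines.
sample_lines | sed -n '/bash/p'
# Both commands print /bin/bash and /usr/bin/bash.
```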