I use PCRE regular expressions for search and replace very often when working in a text editor, and I was quite unhappy to find that in powerful Unix command-line tools like perl, awk or sed, using even a moderately advanced multiline regex is fairly complicated and requires various hard-to-remember syntax for different situations.
Is there a command-line tool for Linux in which search and replace (for all occurrences in the whole file) using a more complex multiline regex is as simple as:
magicregextool 's/.* > (.*) joined the channel\.\n(((?!.* \1 (was kicked from channel\.|was banned from channel\.)\n).*\n)+?.*\1 disconnected)/\2/' file.txt
i.e. the regex to match is the same as I would place in the "search for" field in a text editor, the replacement string can handle multiline regex as well, and there's no need for any convoluted syntax?
EDIT:
As requested, here is an input I'd use the example regex above on, along with an explanation of what I want it to do.
An input like this:
2016-05-16 06:17:00 > foobar joined the channel.
2016-05-16 06:17:13 <foobar> hi
2016-05-16 06:18:30 > foobar was kicked from channel.
2016-05-16 06:18:30 > foobar disconnected
2016-05-16 06:20:13 > user joined the channel.
2016-05-16 06:20:38 <user> bye
2016-05-16 06:21:57 > user disconnected
should produce this output:
2016-05-16 06:17:00 > foobar joined the channel.
2016-05-16 06:17:13 <foobar> hi
2016-05-16 06:18:30 > foobar was kicked from channel.
2016-05-16 06:18:30 > foobar disconnected
2016-05-16 06:20:38 <user> bye
2016-05-16 06:21:57 > user disconnected
The regex matches any line that contains "[username] joined the channel" and looks for a line below it that contains "[username] disconnected", unless there is a "[username] was kicked from channel." or "[username] was banned from channel." line between those two lines.
The replacement string then replaces the matched text with every line that follows the "[username] joined the channel" line, effectively deleting the line "2016-05-16 06:20:13 > user joined the channel." from the input above.
This most likely doesn't make much sense to you, but it is just an example regex similar to one I've dealt with recently. Please keep in mind that I'm NOT looking for a solution to this particular problem, or to similar problems, with the Unix tools I listed above. I'm looking for a command-line tool that can use the unmodified "search for" and replacement strings that I use in a text editor (Geany in particular, but that shouldn't really matter), without complicated syntax or added programming logic to deal with the multiline "search for" and replacement strings.
Best Answer
I'm not sure why Perl isn't acceptable here. On the inputs you provided, this line gives the output you asked for:
The -e argument is exactly the first argument to your magicregextool, except that I added the /mg regex modifier. This may not be "unmodified", but it doesn't seem unreasonable either. If you don't want to type in the whole line, how about this script as magicregextool:
Or even:
Then you just type:
Which is the same as your sample (again, other than adding the /mg modifier).
An additional benefit of this is that if you are running multiple related search/replace operations on each file, you can put them together in the same script: