Perl for matching with regular expressions in Terminal


I'm trying to familiarize myself a little with Perl to use for regular expression searches in Terminal (Mac). Now, I'm not really looking to learn Perl rigorously, just trying to find out how to do some simple regular expressions.

But I can't figure out how to do this in Terminal:

I'd like to be able to match expressions over several lines, and I'll take HTML tags as an example. PLEASE NOTE that the HTML tag is just an example of something to match, and specifically something that goes over multiple lines. Whether matching HTML with regular expressions is a good idea or not is not the issue. I just want to understand the syntax of matching with Perl on the command line!

Say I want to match the entire ul tag here:

<ul>
 <li>item 1</li>
 <li>item 2</li>
</ul>

I would like to:

  1. Be able to match this in a file and output the match to stdout (don't ask why, I just want to understand how it works :-))
  2. Be able to replace it with something else.

For matching, I found something like this (using 'start' and 'end' as an example here, from a simple text file I was testing with, but please give the example for the ul tag instead):

perl -wnE 'say $1 if /(start(.*?)end)/' test.txt 

This matches a part, but only on one line. Surprisingly, adding the s at the end didn't work to make it "dotall" or "single-line mode", it still just matched one line…

For replacing, I tried something like this:

perl -pe 's/start(.*?)end/replacement text/'s test.txt

This didn't work either…

Best Answer

Well, here's a wikipedia page for matching or replacing with Perl one liners. I did this in Cygwin:

Perl can behave like grep or like sed.

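The /s modifier makes the dot (.) match a newline as well.

The -0777 switch makes Perl read the whole input as a single record, so the regular expression is applied to the entire text instead of line by line.

A literal \n in the pattern can match a newline as well.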

$ echo -e 'a\nb\nc\nd' | perl -0777 -pe 's/.*c//s'

d

user@comp ~
$ echo -e 'a\nb\nc\nd' | perl -pe 's/.*c//s'
a
b

d

Here is the other form, -ne with print $1:

user@comp ~
$ echo -e 'a\nb\nc\nd' | perl -ne 'print $1 if /(.*c)/s'
c
user@comp ~
$ echo -e 'a\nb\nc\nd' | perl -0777 -ne 'print $1 if /(.*c)/s'
a
b
c
user@comp ~
$

Also

$ echo xxx|perl -lne 'print ""'

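Perl's equivalent of \0 or & (the whole match) is $&. In the examples here, $_ (the current line) stands in for it, which works only because the whole line is the match. To put text directly before and after the variable without a space, write ${_}: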

$ echo xxx|perl -lne 'print "a${_}${_}a"'
axxxxxxa

and

$  echo xxx|perl -lpe 's/.*/a${_}${_}a"/'
axxxxxxa"

### Some further examples

$ cat t.t
<ul>
 <li>item 1</li>
 <li>item 2</li>
</ul>

$ perl -0777 -ne 'print $1 if /\<ul\>(.*?)\<\/ul>/s' t.t

 <li>item 1</li>
 <li>item 2</li>

user@comp ~
$ perl -0777 -ne 'print $1 if /(.*)/s' t.t
<ul>
 <li>item 1</li>
 <li>item 2</li>
</ul>

user@comp ~
$

An example of global matching for the -ne form (add g and change "if" to "while"):

$ echo -e 'bbb' | perl -0777 -ne 'print $1 while /(b)/sg'
bbb

For the -pe one, just add the g at the end (/sg or /gs, same thing):

$  echo -e 'aaa' | perl -0777 -pe 's/a/z/s'
zaa

user@comp ~
$  echo -e 'aaa' | perl -0777 -pe 's/a/z/sg'
zzz

Note: this question contrasts /s and -0777.

The print $1 examples above print only the captured text, not the whole line. To print each matching line in full, the way grep does, this article, https://dzone.com/articles/perl-as-a-better-grep, gives the example perl -wln -e "/RE/ and print;" foo.txt