Shell – Extract text including parens

sed, shell-script, text-processing

I have some text like this:

Sentence #1 (n tokens):
Blah Blah Blah
[...
 ...
 ...]
( #start first set here
 ... (other possible parens and text here)
 ) #end first set here

(...)
(...)

Sentence #2 (n tokens):

I want to extract the second set of parens (including everything in between), i.e.,

(
 ... (other possible parens here)
)

Is there a bash way to do this? I tried the simple

 's/(\(.*\))/\1/'

Best Answer

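This will do it. It reads the whole input in one go (-0777), tracks the paren nesting depth character by character, and prints the second top-level group. There's probably a better way, but this is the first approach that came to mind: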

echo 'Sentence #1 (n tokens):
Blah Blah Blah
[...
 ...
 ...]
(
 ... (other possible parens here)
 )

(...)
(...)

Sentence #2 (n tokens):
' | perl -0777 -nE '
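    $wanted = 2;   # which top-level group to print (1-based)
    $level = 0;    # current paren nesting depth
    $n = 0;        # top-level groups closed so far
    $text = "";    # characters of the group being collected
    for $char (split //) {
        $level++ if $char eq "(";
        $text .= $char if $level > 0;
        # the depth check ignores a stray ")" outside any group
        if ($char eq ")" && $level > 0) {
            # depth back to 0 means a top-level group just closed
            if (--$level == 0) {
                if (++$n == $wanted) {
                    say $text;
                    exit;
                }
                $text = "";
            }
        }
    }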
'

outputs

(
 ... (other possible parens here)
 )