I have a file that is built like this:
(MATH[NUMBER1=(50, unknown, unknown), NUMBER2=(unknown, 4, unknown), OPERATOR='times']
(NUM[SEM=(50, unknown, unknown)] (DIZAINE[SEM=50] cinquante))
(OPERATEUR[SEM='times'] multiplie)
(NUM[SEM=(unknown, 4, unknown)] (UNITE[SEM=4] quatre)))
How can I extract the values 50, 'times' & 4?
I've tried with awk, but I ran into parenthesis-balancing issues.
Best Answer
If you want to extract the non-parenthesized values of the SEM attribute, you can do so using grep in PCRE mode, or with perl itself. Both approaches use regular expression lookarounds.