Use of alternation “|” in sed’s regex


I am using GNU sed version 4.2.1.
I want to use the alternation "|" symbol in a subexpression.
For example:

echo "blia blib bou blf" | sed 's/bl\(ia|f\)//g'

should return

" blib bou "

but it returns

"blia blib bou blf".

How can I get the expected result?

Best Answer

In GNU sed's default (basic) regex syntax, the "|" also needs a backslash to get its special meaning, just as the parentheses do.

echo "blia blib bou blf" | sed 's/bl\(ia\|f\)//g'

will do what you want.
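If the backslashes get hard to read, extended regular expressions may be an option. The sketch below compares the two forms. It assumes your sed accepts -E (BSD sed and recent GNU sed do; older GNU sed releases document the same switch as -r). The variable names bre and ere are only for illustration.

```shell
# Basic regex (the default): ( ) and | need backslashes to be special.
bre=$(echo "blia blib bou blf" | sed 's/bl\(ia\|f\)//g')

# Extended regex (-E): ( ) and | are special without backslashes.
ere=$(echo "blia blib bou blf" | sed -E 's/bl(ia|f)//g')

# Brackets make the leading and trailing spaces visible.
printf '[%s]\n[%s]\n' "$bre" "$ere"
```

Both lines should print [ blib bou ].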

As you know, if all else fails, read the manual :-).

GNU sed user's manual, section 3.3 Overview of Regular Expression Syntax:

`REGEXP1\|REGEXP2'

Matches either REGEXP1 or REGEXP2.

Note the backslash...

Unfortunately, regex syntax is not really standardized: there are many variants, which differ, among other things, in which special characters need a backslash and which do not. In some tools the dialect is even configurable or depends on switches (GNU grep, for example, can be switched between three different regex dialects).
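As a rough illustration of the switch-dependent behaviour, here is the same alternation written for two of GNU grep's dialects. This is a sketch that assumes GNU grep; -G selects basic and -E selects extended syntax (the third dialect, Perl-compatible via -P, is not shown because not every build includes it). The variable names are made up for the example.

```shell
# -G (basic, the default): alternation and grouping are backslashed.
basic=$(printf 'blia\nblib\nblf\n' | grep -c -G 'bl\(ia\|f\)')

# -E (extended): the same pattern without backslashes.
extended=$(printf 'blia\nblib\nblf\n' | grep -c -E 'bl(ia|f)')

# -c counts matching lines; "blia" and "blf" match, "blib" does not.
echo "$basic $extended"
```

Both counts should be 2.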

This answer in particular is for GNU sed. There are other sed variants, for example the one used in the BSDs, which behave differently.
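If a script has to run on several sed variants, one cautious approach is to avoid alternation altogether, since \| in basic syntax cannot be relied on outside GNU sed. A minimal sketch, assuming only that the sed in use accepts multiple -e expressions:

```shell
# One substitution per alternative; no \| and no -E needed.
portable=$(echo "blia blib bou blf" | sed -e 's/blia//g' -e 's/blf//g')

# Brackets make the leading and trailing spaces visible.
printf '[%s]\n' "$portable"
```

This trades compactness for portability and gets unwieldy once there are more than a handful of alternatives.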
