How to Remove Everything Between Two Characters with SED

bash linux sed

How do you remove all the text between two characters using sed?

Eg:

00arbez+15611@hotmail.com
00aryapan+kee45j@rediffmail.com
asghrsha+hfcdedd@yahoo.com

I want to remove the text from + to @ in each email address. (The + itself must be deleted, but the @ must be retained.)

I used the following command:

sed -e 's/\(+\).*\(@\)/\1\2/' FILE.txt > RESULT.txt

But the output file still includes the "+" sign.
E.g.: asghrsha+@yahoo.com

I want the following output:

00arbez@hotmail.com
00aryapan@rediffmail.com
asghrsha@yahoo.com

Can someone help me with modifying the above sed command?

Best Answer

The simple solution is to match everything from the + up to and including the @, and then put back only the part you want to keep, which is the @.

sed 's/+[^@+]*@/@/' FILE.txt >RESULT.txt

Your command put the + back via \1, even though you did not want to keep it, which is why it still shows up in the output.

You can capture the string you want to keep using \( ... \) grouping parentheses, but in this case the text to keep is a completely static string, so I opted to keep the regex and the replacement as simple as possible and just hardcoded @ as the replacement string.
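For comparison, here is a sketch of the capture-group route, made by changing the original command as little as possible: only the replacement differs, with \1 (the +) dropped and \2 (the @) kept. The sample addresses are piped in rather than read from FILE.txt, and the variable names input and result are just for illustration. Note that .* is greedy, so on a line containing several @ signs it would run to the last one.

```shell
# Sample addresses from the question, piped in instead of read from FILE.txt.
input='00arbez+15611@hotmail.com
00aryapan+kee45j@rediffmail.com
asghrsha+hfcdedd@yahoo.com'

# Same regex as the original command; only the replacement changes.
# \1 holds the + and \2 holds the @, so emitting just \2 drops the +.
result=$(printf '%s\n' "$input" | sed -e 's/\(+\).*\(@\)/\2/')

printf '%s\n' "$result"
```

On the three sample lines this should print the addresses with everything from + up to the @ removed.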

Notice also how the regex takes care not to straddle multiple plus signs or @ signs. If you actually do want the match to span repeated + characters, take the plus out of the negated character class, leaving only [^@].
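A quick way to see the difference between the two character classes is to run both on an address with more than one plus sign. The address below is made up for illustration and does not come from the question; strict and loose are likewise just illustrative variable names.

```shell
# Hypothetical address with two plus signs (not from the question).
addr='user+a+b@example.com'

# [^@+] refuses to cross a second +, so the match starts at the last + before the @.
strict=$(printf '%s\n' "$addr" | sed 's/+[^@+]*@/@/')

# [^@] may cross further + signs, so the match starts at the first +.
loose=$(printf '%s\n' "$addr" | sed 's/+[^@]*@/@/')

printf '%s\n' "$strict"   # user+a@example.com
printf '%s\n' "$loose"    # user@example.com
```

Which behaviour is right depends on whether everything after the first + should count as the part to strip.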
