Sed: delete text between a string until first occurrence of another string

regex sed

Imagine I have something like the following text:

The quick brown fox jumps in 2012 and 2013

I would like to delete everything from "fox" through the first four-digit number, and only that first occurrence, so that I end up with:

The quick brown and 2013

Something like this:

echo "The quick brown fox jumps in 2012 and 2013" \
   | sed  "s/fox.*\([0-9]\{4\}\)//g"

…gives me:

The quick brown

So it removed everything up to and including the last occurrence of the four digits.

Any ideas?

Best Answer

POSIX regular expressions used by sed (both the "basic" and "extended" versions) do not support non-greedy matches. (There are workarounds, such as using [^0-9]* in place of .*, but they break as soon as the input varies, for example when a stray digit appears between "fox" and the number you want.)
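For reference, the [^0-9]* workaround mentioned above might look like the sketch below. It assumes no digit appears between "fox" and the first four-digit number; the variable names are only illustrative, and the trailing space in the pattern is there to avoid leaving a double space behind.

```shell
# Workaround sketch: [^0-9]* cannot run past the first digit, so the match
# ends at the first four-digit number instead of the last one.
# Assumption: no digit appears between "fox" and that number.
input="The quick brown fox jumps in 2012 and 2013"
result=$(printf '%s\n' "$input" | sed 's/fox[^0-9]*[0-9]\{4\} //')
printf '%s\n' "$result"
```

With the sample sentence this prints "The quick brown and 2013".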

What you need can be achieved in Perl, where appending ? to a quantifier makes it non-greedy:

echo "The quick brown fox jumps in 2012 and 2013" \
   | perl -pe 's/fox.*?([0-9]{4})//g'

You may want to remove the leftover space as well, since this leaves two spaces between "brown" and "and".

Related Solutions

Delete the first known character in a string with sed

sed 's/^@\(.*\)/\1/'

^ anchors the match at the beginning of the line

@ is your known character

\(.*\) is the rest of the line, captured

The captured block is then substituted into the output. Sorry, I can't test it at the moment, but it should be something like that.
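A quick check of that command, using a made-up input string. The sketch also shows a shorter form without the capture group, which should behave the same here because only the leading character is being dropped.

```shell
# Hypothetical input; "@" stands in for the known leading character.
input="@example"

# The answer's form: capture the rest of the line and put it back.
with_capture=$(printf '%s\n' "$input" | sed 's/^@\(.*\)/\1/')

# Shorter form: just delete the leading "@".
without_capture=$(printf '%s\n' "$input" | sed 's/^@//')

printf '%s\n%s\n' "$with_capture" "$without_capture"
```

Both variants print "example". A line that does not start with @ is left unchanged.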

Sed: replace only the first range of numbers

Use awk instead of sed:

awk -F"'" '{OFS="'"'"'"; $2=$2+10000; print}'
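The nested quoting in that one-liner is hard to read, so here is a sketch of the same idea with the output separator passed through -v instead. The input line is invented for illustration, since the data from the original question is not shown here.

```shell
# Hypothetical input line; the real data from the original question is not shown.
input="value='42' label='x'"

# -F"'" splits the line on single quotes, so $2 is the first quoted number.
# -v OFS="'" puts the quotes back when awk rebuilds the line after $2 changes.
result=$(printf '%s\n' "$input" \
   | awk -F"'" -v OFS="'" '{ $2 = $2 + 10000; print }')
printf '%s\n' "$result"
```

For this input the output is value='10042' label='x': only the first quoted number changes.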
