Bash – RegEx in bash to extract string after the first delimiter

bash, regular expression, shell

This question is not a duplicate of other questions here, because I specifically need a regex in bash using =~ matching.

Assuming I have a string like

string="ananas1kiwi2apple1banana2tree"

The regEx I tried was

[[ $string =~ .*2([[:alnum:]]{1,}) ]] && subString=${BASH_REMATCH[1]}

It was supposed to match the first occurrence of 2 and capture everything after it, but it returns only tree (the string after the second 2). My expected output is apple1banana2tree

I know I am missing a simple construct, but I am not sure which one. I am looking only for a pure bash regex solution, not string manipulation, which I know can be done with "${string#*2}"

Best Answer

Just match 2 and then capture everything after it with .*:

[[ $string =~ 2(.*) ]] && echo "${BASH_REMATCH[1]}"

Example:

$ string="ananas1kiwi2apple1banana2tree"

$ [[ $string =~ 2(.*) ]] && echo "${BASH_REMATCH[1]}"
apple1banana2tree

What's wrong with yours:

  • .* is greedy: in .*2 it matches up to the last 2. Since the non-greedy .*? is not available in ERE, use [^2]*2 to stop at the first 2 instead

  • Also, {1,} is just +

So do:

[[ $string =~ [^2]*2([[:alnum:]]+) ]]
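To see the difference side by side, here is a small sketch that runs both patterns against the string from the question. The variable names greedy and first are only illustrative:

```bash
#!/usr/bin/env bash
string="ananas1kiwi2apple1banana2tree"

# Greedy: .* runs up to the last 2, so the group only sees the tail.
[[ $string =~ .*2([[:alnum:]]+) ]] && greedy=${BASH_REMATCH[1]}

# Negated class: [^2]* cannot cross a 2, so the literal 2 is the first one.
[[ $string =~ [^2]*2([[:alnum:]]+) ]] && first=${BASH_REMATCH[1]}

echo "greedy: $greedy"   # greedy: tree
echo "first:  $first"    # first:  apple1banana2tree
```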

In any case, there is no need to match from the start. An unanchored regex finds the leftmost match, so just do:

[[ $string =~ 2([[:alnum:]]+) ]]
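Note that (.*) and ([[:alnum:]]+) only give the same result when everything after the first 2 is alphanumeric. A short sketch with a made-up input containing a hyphen (not from the question) shows where they diverge:

```bash
#!/usr/bin/env bash
# Hypothetical input: same as the question's string, but with a hyphen.
string="ananas1kiwi2apple-banana2tree"

# (.*) captures everything after the first 2, whatever the characters are.
[[ $string =~ 2(.*) ]] && rest=${BASH_REMATCH[1]}

# ([[:alnum:]]+) stops at the first non-alphanumeric character.
[[ $string =~ 2([[:alnum:]]+) ]] && word=${BASH_REMATCH[1]}

echo "$rest"   # apple-banana2tree
echo "$word"   # apple
```

Which one fits depends on whether the tail may contain punctuation or spaces; for the string in the question both return apple1banana2tree.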