Bash – Does bash support back references in parameter expansion

bash, regular expression, string, variable substitution

I have a variable named descr which can contain a string Blah: -> r1-ae0-2 / [123], -> s7-Gi0-0-1:1-US / Foo, etc. I want to get the -> r1-ae0-2, -> s7-Gi0-0-1:1-US part from the string. At the moment I use descr=$(grep -oP '\->\s*\S+' <<< "$descr") for this. Is there a better way to do this? Is it also possible to do this with parameter expansion?

Best Answer

ksh93 and zsh support back-references (or more accurately¹, references to capture groups in the replacement) inside ${var/pattern/replacement}; bash does not.

ksh93:

$ var='Blah: -> r1-ae0-2 / [123]'
$ printf '%s\n' "${var/*@(->*([[:space:]])+([^[:space:]]))*/\1}"
-> r1-ae0-2

zsh:

$ var='Blah: -> r1-ae0-2 / [123]'
$ set -o extendedglob
$ printf '%s\n' "${var/(#b)*(->[[:space:]]#[^[:space:]]##)*/$match[1]}"
-> r1-ae0-2

(The mksh man page also mentions that future versions will support it with ${KSH_MATCH[1]} for the first capture group. It was not available yet as of 2017-04-25.)

However, with bash, you can do:

$ [[ $var =~ -\>[[:space:]]*[^[:space:]]+ ]] &&
  printf '%s\n' "${BASH_REMATCH[0]}"
-> r1-ae0-2

Which is better as it checks that the pattern is found first.

If your system's regexps support \s/\S, you can also do:

re='->\s*\S+'
[[ $var =~ $re ]]
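
A minimal sketch of wrapping that test in a bash function, so the caller can branch on whether the pattern was found. The function name first_arrow_target is hypothetical, and POSIX character classes are used in place of \s/\S so that the sketch does not depend on the system's regexps supporting them:

```bash
# Hypothetical helper: print the first "-> target" found in $1.
# Returns non-zero (and prints nothing) when there is no match.
first_arrow_target() {
  local re='->[[:space:]]*[^[:space:]]+'
  [[ $1 =~ $re ]] || return 1
  printf '%s\n' "${BASH_REMATCH[0]}"
}

descr='Blah: -> r1-ae0-2 / [123]'
# Only overwrite descr when a match was actually found.
if target=$(first_arrow_target "$descr"); then
  descr=$target
fi
printf '%s\n' "$descr"
```

Keeping the regexp in a variable avoids having to escape > inside [[ ... ]], and the if guard leaves descr untouched when there is nothing to extract.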

With zsh, you can get the full power of PCREs with:

$ set -o rematchpcre
$ [[ $var =~ '->\s*\S+' ]] && printf '%s\n' $MATCH
-> r1-ae0-2

With zsh -o extendedglob, see also:

$ printf '%s\n' ${(SM)var##-\>[[:space:]]#[^[:space:]]##}
-> r1-ae0-2

Portably:

$ expr " $var" : '.*\(->[[:space:]]*[^[:space:]]\{1,\}\)'
-> r1-ae0-2
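
For completeness, the first match can also be obtained with plain parameter expansion, without any capture-group reference, by chaining a few expansions. This is only a sketch: the function name arrow_target and its variable names are hypothetical, and it is arguably less readable than the regexp versions above:

```bash
# Hypothetical helper: extract the first "-> target" from $1 using
# only #, ## and %% style parameter expansions.
arrow_target() {
  local rest ws word
  case $1 in
    *'->'*) ;;            # contains an arrow: carry on
    *) return 1 ;;        # no arrow at all
  esac
  rest=${1#*->}                 # drop everything up to the first "->"
  ws=${rest%%[![:space:]]*}     # the whitespace right after the arrow
  rest=${rest#"$ws"}            # drop that whitespace
  word=${rest%%[[:space:]]*}    # keep everything up to the next whitespace
  [ -n "$word" ] || return 1    # arrow followed by nothing usable
  printf '%s\n' "->$ws$word"
}

arrow_target 'Blah: -> r1-ae0-2 / [123]'
```

Like the other approaches, it only reports the first occurrence.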

If there are several occurrences of the pattern in the string, the behaviour varies between those solutions, and none of them gives you a newline-separated list of all matches like your GNU-grep-based solution does.

To do that, you'd need to do the looping by hand. For instance, with bash:

re='(->\s*\S+)(.*)'
while [[ $var =~ $re ]]; do
  printf '%s\n' "${BASH_REMATCH[1]}"
  var=${BASH_REMATCH[2]}
done
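
The same loop can be sketched as a function that works on a copy of the string and stores the matches in an array, so that $var is not consumed. The names collect_arrow_targets and matches are hypothetical, and POSIX character classes replace \s/\S here for portability across regexp libraries:

```bash
# Hypothetical helper: fill the global array "matches" with every
# "-> target" found in $1, from left to right.
collect_arrow_targets() {
  local s=$1
  local re='(->[[:space:]]*[^[:space:]]+)(.*)'
  matches=()
  while [[ $s =~ $re ]]; do
    matches+=("${BASH_REMATCH[1]}")
    s=${BASH_REMATCH[2]}   # continue with what follows the recorded match
  done
}

collect_arrow_targets 'Blah: -> r1-ae0-2 / [123], -> s7-Gi0-0-1:1-US / Foo'
printf '%s\n' "${matches[@]}"
```

With that input, the two matches -> r1-ae0-2 and -> s7-Gi0-0-1:1-US are printed one per line.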

With zsh, you could resort to this kind of trick to store all the matches in an array:

set -o extendedglob
matches=() n=0
: ${var//(#m)->[[:space:]]#[^[:space:]]##/${matches[++n]::=$MATCH}}
printf '%s\n' $matches

¹ Back-reference more commonly designates a part of a pattern that references what was matched by an earlier group. For instance, the \(.\)\1 basic regular expression matches a single character followed by that same character (it matches on aa, not on ab). That \1 is a back-reference to the \(.\) capture group in the same pattern.

ksh93 does support back-references in its patterns (for instance, ls -d -- @(?)\1 lists the file names that consist of two identical characters); other shells do not. Standard BREs and PCREs support back-references, but standard EREs do not, though some ERE implementations support them as an extension. bash's [[ foo =~ re ]] uses EREs.

[[ aa =~ (.)\1 ]]

will not match, but

re='(.)\1'; [[ aa =~ $re ]]

may if the system's EREs support it.

Related Solutions

How to Prevent Parameter Expansion in Bash

You are mistaken: double quotes, as in "$b", do not prevent parameter expansion. They prevent pathname expansion (aka globbing) and field splitting.

If you want to prevent parameter expansion, you need single quotes, as in '$b', or a backslash escape, as in \$b:

$ echo '$b'

or:

$ echo \$b

Then $b is output literally.

In your example, nothing prevents parameter expansion.

When the shell reads the input c=$(b=2; echo $b), it performs token recognition and sees that $( is the token that starts a command substitution. So it treats the rest of the string between $( and ) as code to be interpreted in the subshell created by the command substitution, not in the current shell.
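
A short sketch that puts those points side by side. The starting value b=1 and the variable names double, single and escaped are assumptions added for illustration:

```bash
b=1
c=$(b=2; echo "$b")   # b=2 is assigned and expanded inside the subshell only

double="$b"           # double quotes: $b is still expanded
single='$b'           # single quotes: $b stays literal
escaped=\$b           # backslash: $b stays literal

printf '%s\n' "b=$b c=$c double=$double single=$single escaped=$escaped"
```

This prints b=1 c=2 double=1 single=$b escaped=$b: the assignment made inside $(...) does not leak into the current shell, and only the single-quoted and escaped forms suppress the expansion.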

Zsh – Parameter Expansion Techniques

fpath is an array. You should not try to demote it to a string and then replace characters in it with newlines.

With ksh-like array expansion syntax:

$ printf '%s\n' "${fpath[@]}"
/usr/local/share/zsh/site-functions
/usr/share/zsh/site-functions
/usr/share/zsh/5.7.1/functions

With zsh, you can also use "$fpath[@]" or "${(@)fpath}".

You can also do:

$ printf '%s\n' $fpath
/usr/local/share/zsh/site-functions
/usr/share/zsh/site-functions
/usr/share/zsh/5.7.1/functions

But note that it skips empty elements of the array (likely not a problem for $fpath).

zsh's print builtin can also print elements one per line with the -l option. Like in ksh where the print builtin comes from, you do need the -r option though to print arbitrary data raw, so:

print -rl -- $fpath

would be equivalent to the standard printf '%s\n'.

Both differ from -C1 when $fpath is an empty list: in that case, print -rC1 -- $fpath outputs nothing, while print -rl and printf '%s\n' output one empty line. So

print -rC1 -- "$array[@]"

is probably the closest to what you want for printing the elements of an array one per line. In bash/ksh, you could write that as:

[ "${#array[@]}" -eq 0 ] || printf '%s\n' "${array[@]}"
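
A sketch of that bash idiom wrapped in a function, which keeps empty elements but prints nothing at all for an empty list. The function name print_lines and the sample arrays are hypothetical:

```bash
# Hypothetical helper: print each argument on its own line,
# and print nothing when called without arguments.
print_lines() {
  [ "$#" -eq 0 ] || printf '%s\n' "$@"
}

array=('first' '' 'third item')
print_lines "${array[@]}"    # three lines, the second one empty

empty=()
print_lines "${empty[@]}"    # no output, not even an empty line
```

Quoting "${array[@]}" is what preserves the empty element and the embedded space.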
