Bash – parameter substitution/expansion alternative for “| cut -f1,2,3 -d:” a.k.a. trim after and including n-th character occurrence

bash-expansion cut string

An ancient version of ipconfig (inside initramfs) accepts at most 7 colon-separated elements in its user input, like:

ip=client-ip:server-ip:gw-ip:netmask:hostname:device:autoconf

Supplying more than 7 elements results in an ipconfig error.

Therefore the extra elements (2 DNS resolvers) should be chopped off.

That can be done inside a subshell with cut, like:

validated_input=$(echo "${user_input}" | cut -f1,2,3,4,5,6,7 -d:)

How can such a cut be written using (b)ash parameter expansion/substitution?

Without:

  • launching subshell(s)/subprocess(es) (piping)
  • IFS-wrangling/mangling

Because of (1) speed (see “Using bash variable substitution instead of cut/awk”) and (2) learning.


In other words: how can one look up the n-th (7th) occurrence of a character and trim everything from there to the end of the string?

Best Answer

This uses only parameter expansion:

${var%:"${var#*:*:*:*:*:*:*:}"}

Example:

$ var=client-ip:server-ip:gw-ip:netmask:hostname:device:autoconf:morefields:another:youwantanother:haveanother:
$ echo "${var%:"${var#*:*:*:*:*:*:*:}"}"
client-ip:server-ip:gw-ip:netmask:hostname:device:autoconf

Thanks ilkkachu for coming up with a fix to the trailing :!
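To check how the expansion behaves on inputs other than the one above, here is a small sketch. The function name trim_to_7 and the result variable trimmed are made up for illustration; the result is handed back in a variable rather than printed, so the caller needs no command substitution.

```bash
# Hypothetical helper: keep at most the first 7 colon-separated fields of $1.
# The result is stored in the global variable "trimmed" (an assumed name),
# so no subshell is needed to capture it.
trim_to_7() {
    trimmed=${1%:"${1#*:*:*:*:*:*:*:}"}
}

trim_to_7 'a:b:c:d:e:f:g:dns1:dns2'   # more than 7 fields
more=$trimmed

trim_to_7 'a:b:c:d:e:f:g'             # exactly 7 fields
exact=$trimmed

trim_to_7 'a:b:c'                     # fewer than 7 fields
fewer=$trimmed

trim_to_7 'a:b:c:d:e:f:g:'            # 7 fields plus a trailing colon
trailing=$trimmed

printf '%s\n' "$more" "$exact" "$fewer" "$trailing"
```

With fewer than 7 colons the inner pattern does not match, so the inner expansion yields the whole string. The outer pattern is then a colon plus the whole string, which is longer than the string itself and cannot match, so the input comes back unchanged.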


${parameter#word}
${parameter##word}

The word is expanded to produce a pattern just as in filename expansion (see Filename Expansion). If the pattern matches the beginning of the expanded value of parameter, then the result of the expansion is the expanded value of parameter with the shortest matching pattern (the ‘#’ case) or the longest matching pattern (the ‘##’ case) deleted. If parameter is ‘@’ or ‘*’, the pattern removal operation is applied to each positional parameter in turn, and the expansion is the resultant list. If parameter is an array variable subscripted with ‘@’ or ‘*’, the pattern removal operation is applied to each member of the array in turn, and the expansion is the resultant list.

This will attempt to match the beginning of your parameter, and if it does it will strip it.

Example:

$ var=a:b:c:d:e:f:g:h:i
$ echo "${var#a}"
:b:c:d:e:f:g:h:i
$ echo "${var#a:b:}"
c:d:e:f:g:h:i
$ echo "${var#*:*:}"
c:d:e:f:g:h:i
$ echo "${var##*:}"    # Two hashes make it greedy
i
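The manual's remark about array variables can be illustrated with a short sketch. The array name addrs and its contents are invented for this example, and arrays need bash; plain POSIX sh and busybox ash do not have them.

```bash
# Hypothetical data: "label:value" pairs.
addrs=(client:10.0.0.2 server:10.0.0.1 gw:10.0.0.254)

# With [@], the shortest prefix matching "*:" is removed from every element.
values=("${addrs[@]#*:}")

printf '%s\n' "${values[@]}"
```

This should print the three addresses, one per line, with the labels and colons removed.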

${parameter%word}
${parameter%%word}

The word is expanded to produce a pattern just as in filename expansion. If the pattern matches a trailing portion of the expanded value of parameter, then the result of the expansion is the value of parameter with the shortest matching pattern (the ‘%’ case) or the longest matching pattern (the ‘%%’ case) deleted. If parameter is ‘@’ or ‘*’, the pattern removal operation is applied to each positional parameter in turn, and the expansion is the resultant list. If parameter is an array variable subscripted with ‘@’ or ‘*’, the pattern removal operation is applied to each member of the array in turn, and the expansion is the resultant list.

This will attempt to match the end of your parameter, and if it does it will strip it.

Example:

$ var=a:b:c:d:e:f:g:h:i
$ echo "${var%i}"
a:b:c:d:e:f:g:h:
$ echo "${var%:h:i}"
a:b:c:d:e:f:g
$ echo "${var%:*:*}"
a:b:c:d:e:f:g
$ echo "${var%%:*}"    # Two %s make it greedy
a
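The prefix and suffix operators can also be chained. As a sketch (the variable names rest and third are arbitrary), the third field of the same string can be picked out by first dropping two fields from the front and then dropping everything from the next colon onward:

```bash
var=a:b:c:d:e:f:g:h:i

rest=${var#*:*:}      # shortest prefix containing two colons removed
third=${rest%%:*}     # longest suffix starting at a colon removed

echo "$rest"          # c:d:e:f:g:h:i
echo "$third"         # c
```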

So in the answer:

${var%:"${var#*:*:*:*:*:*:*:}"}

(Note the quotes around ${var#...}: they make its result a literal string, not a pattern, when it is stripped off the end of $var.)
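Why the quotes matter shows up once the part to be removed contains pattern characters. The following sketch uses a made-up input whose 8th field contains square brackets and compares the quoted form with an unquoted one:

```bash
var='a:b:c:d:e:f:g:h[1]'   # hypothetical input; the 8th field contains [ and ]

# Quoted: the remainder "h[1]" is matched literally and removed.
quoted=${var%:"${var#*:*:*:*:*:*:*:}"}

# Unquoted: the remainder becomes the pattern ":h[1]", which would match ":h1"
# rather than the literal text ":h[1]", so nothing is removed.
unquoted=${var%:${var#*:*:*:*:*:*:*:}}

echo "$quoted"     # a:b:c:d:e:f:g
echo "$unquoted"   # a:b:c:d:e:f:g:h[1]
```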

When applied to:

var=client-ip:server-ip:gw-ip:netmask:hostname:device:autoconf:morefields:another:youwantanother:haveanother:

${var#*:*:*:*:*:*:*:} = morefields:another:youwantanother:haveanother:

That is expanded inside ${var%: ... } like so:

${var%:morefields:another:youwantanother:haveanother:}

So you are saying give me:

client-ip:server-ip:gw-ip:netmask:hostname:device:autoconf:morefields:another:youwantanother:haveanother:

But trim :morefields:another:youwantanother:haveanother: off the end.
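For an arbitrary n, the same idea can be generalized by building the "*:" pattern in a loop. This is only a sketch under stated assumptions: the function name trim_after_nth and the variables pat, i and trimmed are hypothetical, n is assumed to be at least 1, and the separator is assumed to be a single character with no special meaning in shell patterns (such as ":" or ","). It uses only POSIX constructs and starts no subshell.

```bash
# Hypothetical helper: keep the first $3 fields of $1, using separator $2.
# The result goes into the global variable "trimmed"; "pat" and "i" are
# also left as globals to stay within plain POSIX sh.
trim_after_nth() {
    pat=
    i=0
    while [ "$i" -lt "$3" ]; do
        pat="$pat*$2"          # one "*<sep>" per field, e.g. "*:*:*:"
        i=$((i + 1))
    done
    # $pat is left unquoted so it acts as a pattern; the remainder is quoted
    # so it is removed as literal text.
    trimmed=${1%"$2${1#$pat}"}
}

trim_after_nth 'client-ip:server-ip:gw-ip:netmask:hostname:device:autoconf:dns1:dns2' : 7
echo "$trimmed"
```

Capturing the result with $(trim_after_nth ...) would start a subshell again, which is why the sketch stores it in a variable instead.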

The Bash Reference Manual (3.5.3)