Korn Shell – Iterate Through Array of Strings with Regex

Tags: aix, array, grep, ksh, regular expression

I have an array of strings called names. Each element is a name followed by some garbage data, like this:

Jill Shortz, City Contractor, America
Bill Torts, Family Doctor, Canada
Will Courtz, Folk DJ, Bulgaria
Phil-Lip Warts, Juggler, India

I want to iterate through names, extract only the first two words of each element with the regex (^\w+-*( *\w+)*), and write the result back into names so that it contains:

Jill Shortz
Bill Torts
Will Courtz
Phil-Lip Warts

This is how I attempted it, but grep on my AIX machine does not accept the -P option (Perl regex mode):

for((i=0;i<${#names[@]};++i)); do
        names[$i]=`grep -P '(^\w+-*( *\w+)*)' -o <<<"${names[i]}"`
done

Best Answer

I don't see anything in the ksh man page that lets you match a string against a regular expression and use capturing parentheses to extract substrings, as you would in bash with:

[[ $str =~ ^([[:alnum:]]+([ -]+[[:alnum:]]+)+) ]] && echo "${BASH_REMATCH[1]}"

However, you can use extended regular expressions in glob patterns with ~(E:regex), so you can do this:

for n in "${names[@]}"; do
  # remove the longest match of the pattern from the start of the string,
  # which leaves only the unwanted tail in tmp
  tmp=${n##~(E:\w+([ -]+\w+)*)}
  # then remove that tail from the end of the original string
  echo "[${n%$tmp}]"
done

Output:

[Jill Shortz]
[Bill Torts]
[Will Courtz]
[Phil-Lip Warts]
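
The loop above only prints the trimmed names. To store them back into names, as the question asks, one option is to assign by index. The following is a minimal sketch rather than part of the original answer. It assumes, going by the sample data, that the wanted part always ends at the first comma, so it swaps the ~(E:...) pattern for a plain %% expansion, which should also work in shells without ksh93-style extended patterns.

```shell
# Sample data from the question.
names=(
  "Jill Shortz, City Contractor, America"
  "Bill Torts, Family Doctor, Canada"
  "Will Courtz, Folk DJ, Bulgaria"
  "Phil-Lip Warts, Juggler, India"
)

# Overwrite each element in place, dropping everything from the first
# comma onward. Assumption: the name itself never contains a comma.
i=0
while [ "$i" -lt "${#names[@]}" ]; do
  names[$i]=${names[$i]%%,*}
  i=$((i + 1))
done

# Print one element per line to check the result.
printf '%s\n' "${names[@]}"
```

If the garbage data is not reliably introduced by a comma, the regex-based pattern from the answer is the safer choice.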

... and, for maximum write-only unreadability, the same thing in a single expansion:

for n in "${names[@]}"; do
  echo "${n%${n##~(E:\w+([ -]+\w+)*)}}"
done
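
If the ~(E:...) pattern is not available in your ksh, a regex-based fallback is to hand each string to sed instead of grep -P. This is only a sketch: leading_name is a hypothetical helper name, and it assumes the system's sed follows POSIX basic regular expressions (bracket classes and \{1,\} intervals). It uses [[:alnum:]] as in the bash example, so unlike \w it does not treat an underscore as a word character.

```shell
# Hypothetical helper (not from the original answer): print the leading
# "word[-word] word" part of its argument. If the string does not start
# with an alphanumeric word, sed leaves it unchanged.
leading_name() {
  printf '%s\n' "$1" |
    sed 's/^\([[:alnum:]]\{1,\}\([ -]\{1,\}[[:alnum:]]\{1,\}\)*\).*$/\1/'
}

# Two of the sample strings from the question.
for n in "Will Courtz, Folk DJ, Bulgaria" "Phil-Lip Warts, Juggler, India"; do
  leading_name "$n"
done
```

This starts one sed process per element, which is fine for a short list but slower than the pure-shell expansions on a large array.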