Bash – What's the right way to sort an associative array in bash or zsh

bash zsh

I'm wondering how I should sort an associative array in bash. I checked the manual, but found nothing related to sorting.

My current solution is to echo everything out and pipe it to an external program, i.e. key value | sort -k2

That looks inefficient to me.

An example of the array:

A['192.168.2.2']=5
A['192.168.3.2']=1
A['192.168.1.1']=9

I'm looking for the two most-used IP addresses, which are 192.168.1.1 and 192.168.2.2; that is, I need to sort this array by its values.

Best Answer

Zsh has a built-in way to sort lists. However, I don't think there's a way to sort the values while keeping the correlation with the keys using parameter expansion flags and subscript flags, which means that an explicit loop is necessary. Assuming that your values don't contain a null character, you can build an array containing the values and keys concatenated with a null character in between, and sort that.

keys=("${(@k)A}")
values=("${(@v)A}")
combined=()
for ((i=1; i <= $#values; i++)) { combined[i]=($values[i]$'\0'$keys[i]); }
keys_sorted_by_decreasing_value=("${${(@On)combined}#*$'\0'}")
keys_of_the_top_two_values=("${(@)keys_sorted_by_decreasing_value[1,2]}")

EDIT by @sch: the first 4 lines can be simplified to

combined=()
for k v ("${(@kv)A}") combined+=($v$'\0'$k)

The variables keys and values contain the keys and values of A in an arbitrary but consistent order. You can write keys=(${(k)A}) if there are no empty keys, and similarly for values. In the expansion that builds keys_sorted_by_decreasing_value, the O flag sorts in decreasing order and the n flag makes the comparison numeric (so that 9 counts as smaller than 10); without n, the entries are compared lexicographically. Use o instead of O to sort in increasing order (in which case the top two values can be obtained with the subscript [-2,-1]).

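Ksh93 has a way to sort the positional parameters only, with set -s; this also exists in zsh but not in bash 4.2. Note that set -s sorts lexicographically, so numeric values are only ordered correctly if they all have the same number of digits. Assuming your values don't contain newlines or control characters that sort before newlines: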

keys=("${!A[@]}")
combined=()
for ((i=0; i < ${#keys[@]}; i++)); do combined[i]="${A[${keys[i]}]}"$'\n'"${keys[i]}"; done
set -A sorted -s "${combined[@]}"
top_combined=${sorted[${#sorted[@]}-1]}  # -2 for the next-to-largest, etc.
top_key=${top_combined#*$'\n'}
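
Since bash lacks set -s, one way to stay inside the shell when only the top few entries are needed is a repeated maximum scan (a partial selection sort). This is a swapped-in technique rather than part of the approaches above, shown as a minimal sketch that assumes bash 4.0 or later and non-empty integer values. The names n, seen, top_keys, best_key and best_val are illustrative.

```shell
#!/usr/bin/env bash
# Sample data from the question: key = IP address, value = usage count.
declare -A A=(['192.168.2.2']=5 ['192.168.3.2']=1 ['192.168.1.1']=9)

n=2                  # how many top entries to pick (illustrative parameter)
declare -A seen=()   # keys that have already been selected
top_keys=()          # result: keys ordered by decreasing value

for ((round = 0; round < n && round < ${#A[@]}; round++)); do
    best_key='' best_val=''
    for k in "${!A[@]}"; do
        [[ -n ${seen[$k]} ]] && continue          # skip keys picked earlier
        # Assumes values are non-empty integers.
        if [[ -z $best_val ]] || (( ${A[$k]} > best_val )); then
            best_key=$k best_val=${A[$k]}
        fi
    done
    seen[$best_key]=1
    top_keys+=("$best_key")
done

printf '%s\n' "${top_keys[@]}"
```

Each round scans the whole array, so the cost grows with n times the number of keys. That is fine for a handful of top entries, but a full sort of a large array is better left to sort.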

This is all pretty complex, so you might as well go for the external sort, which is a lot easier to write. Assuming that neither keys nor values contain control characters, in ksh or bash:

IFS=$'\n'; set -f
keys_sorted_by_decreasing_value=($(
    for k in "${!A[@]}"; do printf '%s\t%s\n' "${A[$k]}" "$k"; done |
    sort -rn | sed $'s/^[^\t]*\t//'
  ))
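
As a worked check of the external-sort approach on the sample data from the question, here is a minimal bash sketch. It assumes bash 4.0 or later (for associative arrays and mapfile) and keys that contain no tabs or newlines. The name top_two is illustrative.

```shell
#!/usr/bin/env bash
# Sample data from the question: key = IP address, value = usage count.
declare -A A=(['192.168.2.2']=5 ['192.168.3.2']=1 ['192.168.1.1']=9)

# Emit "value<TAB>key" lines, sort them numerically in decreasing order,
# keep the first two lines, then drop the value column to leave the keys.
mapfile -t top_two < <(
    for k in "${!A[@]}"; do printf '%s\t%s\n' "${A[$k]}" "$k"; done |
    sort -rn | head -n 2 | cut -f 2-
)

printf '%s\n' "${top_two[@]}"
```

Reading the result with mapfile -t avoids having to change IFS or disable globbing with set -f, at the cost of requiring bash 4.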