Let's say I have an associative array in bash,
declare -A hash
hash=(
    ["foo"]=aa
    ["bar"]=bb
    ["baz"]=aa
    ["quux"]=bb
    ["wibble"]=cc
    ["wobble"]=aa
)
where both keys and values are unknown to me (the actual data is read from external sources).
How may I create an array of the keys corresponding to the same value, so that I may, in a loop over all unique values, do
printf 'Value "%s" is present with the following keys: %s\n' "$value" "${keys[*]}"
and get the output (not necessarily in this order)
Value "aa" is present with the following keys: foo baz wobble
Value "bb" is present with the following keys: bar quux
Value "cc" is present with the following keys: wibble
The important bit is that the keys are stored as separate elements in the keys
array and that they therefore do not need to be parsed out of a text string.
I could do something like
declare -A seen
seen=()
for value in "${hash[@]}"; do
    if [ -n "${seen[$value]}" ]; then
        continue
    fi
    keys=()
    for key in "${!hash[@]}"; do
        if [ "${hash[$key]}" = "$value" ]; then
            keys+=( "$key" )
        fi
    done
    printf 'Value "%s" is present with the following keys: %s\n' \
        "$value" "${keys[*]}"
    seen[$value]=1
done
But it seems a bit inefficient with that double loop.
Is there a piece of array syntax that I've missed for bash? Would doing this in e.g. zsh give me access to more powerful array manipulation tools?
In Perl, I would do
my %hash = (
    'foo'    => 'aa',
    'bar'    => 'bb',
    'baz'    => 'aa',
    'quux'   => 'bb',
    'wibble' => 'cc',
    'wobble' => 'aa'
);
my %keys;
while ( my ( $key, $value ) = each(%hash) ) {
    push( @{ $keys{$value} }, $key );
}
foreach my $value ( keys(%keys) ) {
    printf( "Value \"%s\" is present with the following keys: %s\n",
        $value, join( " ", @{ $keys{$value} } ) );
}
But bash associative arrays can't hold arrays…
I'd also be interested in any old school solution possibly using some form of indirect indexing (building a set of index array(s) when reading the values that I said I had in hash above?). It feels like there ought to be a way to do this in linear time.
Best Answer
zsh
to reverse keys <=> values
In zsh, the primary syntax for defining a hash is hash=(k1 v1 k2 v2...), like in perl (newer versions also support the awkward ksh93/bash syntax for compatibility, though with variations when it comes to quoting the keys). With that syntax, the hash can be reversed in a single assignment or using a loop.
The @ flag and the double quotes are there to preserve empty keys and values (note that bash associative arrays don't support empty keys). As the expansion of elements in associative arrays is in no particular order, if several elements of $hash have the same value (which will end up being a key in $reversed), you can't tell which key will be used as the value in $reversed.
for your loop
You'd use the R hash subscript flag to get elements based on value instead of key, combined with e for exact (as opposed to wildcard) match, and then get the keys for those elements with the k parameter expansion flag, for example ${(k)hash[(Re)$value]}.
your perl approach
zsh (contrary to ksh93) doesn't support arrays of arrays, but its variables can contain the NUL byte, so you could use that to separate elements if the elements don't otherwise contain NUL bytes, or use ${(q)var} / ${(Q)${(z)var}} to encode/decode a list using quoting.
ksh93
ksh93 was the first shell to introduce associative arrays in 1993. The syntax for assigning values as a whole means it's very difficult to do it programmatically, contrary to zsh, but at least it's somewhat justified in ksh93 in that ksh93 supports complex nested data structures.
In particular, ksh93 supports arrays as values for hash elements, so the keys sharing a value can be stored as an array under that value, much like in the perl approach.
bash
bash added support for associative arrays decades later and copied the ksh93 syntax, but not the other advanced data structures, and it doesn't have any of the advanced parameter expansion operators of zsh.
In bash, you could use the quoted list approach mentioned in the zsh section, using printf %q or, with newer versions, ${var@Q}.
As noted earlier however, bash associative arrays don't support the empty value as a key, so it won't work if some of $hash's values are empty. You could choose to replace the empty string with some placeholder like <EMPTY>, or prefix the key with some character that you'd later strip for display.
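As a rough illustration of the quoted list approach combined with the prefix trick, here is a minimal sketch for bash 4.0 or later. The names keys_by_value and decode_keys, the "x" prefix, and the two extra sample entries (a key containing a space and an empty value) are assumptions made for this example, not something from the question. It makes one pass over hash instead of the nested loop.

```shell
#!/usr/bin/env bash
# Sketch only: requires bash 4.0+ (associative arrays).

# Sample data from the question, plus two entries added for this sketch:
# a key containing a space and an element with an empty value.
declare -A hash=(
  [foo]=aa [bar]=bb [baz]=aa [quux]=bb [wibble]=cc [wobble]=aa
  ["two words"]=bb
  [blank]=''
)

# keys_by_value (name invented here) maps "x<value>" to a space-separated
# list of %q-quoted keys. The "x" prefix keeps the subscript non-empty
# even when the value itself is the empty string.
declare -A keys_by_value=()
for key in "${!hash[@]}"; do
  printf -v quoted '%q' "$key"
  keys_by_value["x${hash[$key]}"]+="$quoted "
done

# decode_keys VALUE (hypothetical helper): fill the global array "keys"
# with the keys recorded for VALUE, one key per element.
decode_keys() {
  local list=${keys_by_value["x$1"]}
  # eval is acceptable here only because every word came from printf %q.
  eval "keys=( $list )"
}

for prefixed in "${!keys_by_value[@]}"; do
  value=${prefixed#x}   # strip the prefix again for display
  decode_keys "$value"
  printf 'Value "%s" is present with the following keys: %s\n' \
    "$value" "${keys[*]}"
done
```

The order of the output lines, and of the keys within each line, is unspecified. In bash 4.4 and later, quoted=${key@Q} could replace the printf -v call. The eval should never be fed a list that was not built entirely from quoted words.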