I have a JSON file, members.json, as below.
{
  "took": 670,
  "timed_out": false,
  "_shards": {
    "total": 8,
    "successful": 8,
    "failed": 0
  },
  "hits": {
    "total": 74,
    "max_score": 1,
    "hits": [
      {
        "_index": "2000_270_0",
        "_type": "Medical",
        "_id": "02:17447847049147026174478:174159",
        "_score": 1,
        "_source": {
          "memberId": "0x7b93910446f91928e23e1043dfdf5bcf",
          "memberFirstName": "Uri",
          "memberMiddleName": "Prayag",
          "memberLastName": "Dubofsky"
        }
      },
      {
        "_index": "2000_270_0",
        "_type": "Medical",
        "_id": "02:17447847049147026174478:174159",
        "_score": 1,
        "_source": {
          "memberId": "0x7b93910446f91928e23e1043dfdf5bcG",
          "memberFirstName": "Uri",
          "memberMiddleName": "Prayag",
          "memberLastName": "Dubofsky"
        }
      }
    ]
  }
}
I want to parse it using a bash script and get only the list of memberId values.
The expected output is:
memberIds
-----------
0x7b93910446f91928e23e1043dfdf5bcf
0x7b93910446f91928e23e1043dfdf5bcG
I tried adding the following bash+python code to .bashrc:
function getJsonVal() {
    if [ \( $# -ne 1 \) -o \( -t 0 \) ]; then
        echo "Usage: getJsonVal 'key' < /tmp/file";
        echo " -- or -- ";
        echo " cat /tmp/input | getJsonVal 'key'";
        return;
    fi;
    cat | python -c 'import json,sys;obj=json.load(sys.stdin);print obj["'$1'"]';
}
And then called:
$ cat members.json | getJsonVal "memberId"
But it throws:
Traceback (most recent call last):
File "<string>", line 1, in <module>
KeyError: 'memberId'
Best Answer
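If you print the whole parsed object first, you can inspect the structure of the nested dictionary obj and see that your original line has to index all the way down to the "memberId" element, that is obj["hits"]["hits"][0]["_source"]["memberId"] rather than obj["memberId"]. This way you can keep the Python as a one-liner.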
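A minimal sketch of that inspection and lookup, written as a standalone Python 3 snippet rather than the Python 2 one-liner from the question (that switch is my assumption). The sample is a trimmed copy of members.json, and first_member_id is a hypothetical helper name used only for illustration.

```python
import json

# Trimmed copy of members.json from the question (one hit only).
SAMPLE = '''
{"hits": {"total": 74, "hits": [
  {"_id": "02:17447847049147026174478:174159",
   "_source": {"memberId": "0x7b93910446f91928e23e1043dfdf5bcf",
               "memberFirstName": "Uri"}}
]}}
'''


def first_member_id(obj, key="memberId"):
    """Return `key` from the _source of the first hit (hypothetical helper)."""
    return obj["hits"]["hits"][0]["_source"][key]


obj = json.loads(SAMPLE)

# The top level only holds "hits", which is why obj["memberId"] raises KeyError.
print(sorted(obj.keys()))    # ['hits']
print(first_member_id(obj))  # 0x7b93910446f91928e23e1043dfdf5bcf
```

The same indexing chain can be pasted into the python -c string of getJsonVal if you want to keep the shell function.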
If there are multiple elements in the nested "hits" element, then you can loop over obj["hits"]["hits"] and print the "memberId" of each element's "_source".
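A sketch of that loop, again in Python 3 and with a trimmed two-hit copy of the question's data. The function name member_ids is hypothetical, and the code assumes every hit has a "_source" containing the key.

```python
import json

# Two hits, trimmed from members.json in the question.
SAMPLE = json.loads('''
{"hits": {"hits": [
  {"_source": {"memberId": "0x7b93910446f91928e23e1043dfdf5bcf"}},
  {"_source": {"memberId": "0x7b93910446f91928e23e1043dfdf5bcG"}}
]}}
''')


def member_ids(obj, key="memberId"):
    """Collect `key` from the _source of every hit, in document order."""
    return [hit["_source"][key] for hit in obj["hits"]["hits"]]


# Print one value per line, as the question's expected output does.
for member_id in member_ids(SAMPLE):
    print(member_id)
```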
Chris Down's solution is better for finding a single value for (unique) keys at any level.
With my second example, which prints out multiple values, you are hitting the limits of what you should try with a one-liner. At that point I see little reason to do half of the processing in bash, and would move to a complete Python solution.
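For what it is worth, such a complete Python solution could look roughly like the sketch below. It is only one possible shape: the script name member_ids.py, the functions render and main, and the fixed-width rule under the header are my assumptions, chosen to reproduce the expected output from the question.

```python
import json
import sys


def render(obj, key="memberId"):
    """Build the report: a header, a rule, then one value per hit."""
    ids = [hit["_source"][key] for hit in obj["hits"]["hits"]]
    # 11 dashes, matching the rule in the question's expected output.
    return "\n".join([key + "s", "-" * 11] + ids)


def main(path):
    """Load the JSON file at `path` and print the report."""
    with open(path) as handle:
        print(render(json.load(handle)))


# Run only when exactly one file name is given, e.g.:
#   python3 member_ids.py members.json
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Keeping the formatting in render and the file handling in main makes it easy to call render from other code or to change the key without touching the I/O.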