JSON Parsing in Shell – Zsh and String Handling

json string zsh

How can I parse JSON output on the shell?

For example, Amazon Web Services provides a CLI to retrieve the status of your instances:

$ aws ec2 describe-instances <my_instance_id>

But the command returns a JSON string. The output of that command looks like this:

$ aws ec2 describe-instances x12345
{
    "Reservations" :
     {
            "OwnerId": "1345345",
            "Groups": [],
            "SecurityGroups": [
               {
                  "Foo" : "yes",
                  "Bar" : "no"
               }
             ]
     }
}

Are there any shell built-ins that could be used to parse JSON output?

For example, I would like to capture the value of output["Reservations"]["SecurityGroups"][0]["Foo"] in a shell variable FOO.

In case it helps, I am specifically interested in solutions that could work from Zsh.

Best Answer

As I understand it, you're looking for the value of "Foo". This is easy to do with the command-line tool jq. It is something like sed in that it implements its own small filter language. Given your example:

json='
{
    "Reservations" :
     {  
            "OwnerId" : "1345345",
            "Groups" :  [],
            "SecurityGroups" : [
               {
                  "Foo" : "yes",
                  "Bar" : "no"
               }
             ]
     }
}'

jq can get yes as simply as:

printf %s "$json" |
jq '.[].SecurityGroups[0].Foo?'

OUTPUT

"yes"

You can walk through an object (a hash or dictionary) using dot notation, and arrays are indexed, as you have probably guessed, with numeric, square-bracketed indices. In the command above I use the empty index form, .[], to indicate that I want all of that level's iterable items expanded. That may be easier to understand this way:

printf %s "$json" | jq '.[][]'

... which breaks out all the values of the second-level items in the hash and gets me:

"1345345"
[]
[
  {
    "Foo": "yes",
    "Bar": "no"
  }
]
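
The same path can also be spelled out key by key, which is convenient when capturing a single value in a shell variable such as the FOO from the question. The following is a minimal sketch rather than a definitive recipe: it assumes jq is installed, reuses the sample data (compacted onto one line) instead of real aws output, and should behave the same in Zsh, Bash, and other POSIX-style shells. The -r option is jq's raw-output flag.

```shell
# Sample document from the answer, compacted onto one line.
json='{"Reservations":{"OwnerId":"1345345","Groups":[],"SecurityGroups":[{"Foo":"yes","Bar":"no"}]}}'

# keys lists the member names that .[] would iterate over, sorted alphabetically.
reservation_keys=$(printf %s "$json" | jq -r '.Reservations | keys[]')
printf '%s\n' "$reservation_keys"

# Name every key explicitly instead of using the empty index form.
# -r (--raw-output) drops the JSON quotes, so FOO holds yes rather than "yes".
FOO=$(printf %s "$json" | jq -r '.Reservations.SecurityGroups[0].Foo')
printf 'FOO=%s\n' "$FOO"
```

With real data, the printf would be replaced by the aws command from the question, piped into the same jq filter.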

This barely scratches the surface of jq's capabilities. It is an immensely powerful tool for working with JSON data in the shell, it compiles to a single executable binary in the classic Unix style, it is very likely available through your distribution's package manager, and it is very well documented. Please visit its project page and see for yourself.

By the way, another way to tackle layered data in JSON, at least to get an idea of what you're working with, might be to go the other way and use the recursive descent operator, .., to break out all values at all levels:

printf %s "$json" | jq '..'

{
  "Reservations": {
    "OwnerId": "1345345",
    "Groups": [],
    "SecurityGroups": [
      {
        "Foo": "yes",
        "Bar": "no"
      }
    ]
  }
}
{
  "OwnerId": "1345345",
  "Groups": [],
  "SecurityGroups": [
    {
      "Foo": "yes",
      "Bar": "no"
    }
  ]
}
"1345345"
[]
[
  {
    "Foo": "yes",
    "Bar": "no"
  }
]
{
  "Foo": "yes",
  "Bar": "no"
}
"yes"
"no"

Far better, though, would probably be to use one of the many discovery or search functions that jq offers for the various types of nodes, such as paths, select, or has.
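
As a rough illustration of that kind of discovery, the sketch below lists the path to every scalar value and then searches the whole document for a key named Foo, wherever it happens to sit. It assumes jq 1.5 or later (for paths and scalars) and reuses the sample data; the variable names leaf_paths and foo_values are made up for this example.

```shell
# Sample document from the answer, compacted onto one line.
json='{"Reservations":{"OwnerId":"1345345","Groups":[],"SecurityGroups":[{"Foo":"yes","Bar":"no"}]}}'

# paths(scalars) emits one array per leaf value, describing how to reach it.
# -c prints each path compactly on its own line.
leaf_paths=$(printf %s "$json" | jq -c 'paths(scalars)')
printf '%s\n' "$leaf_paths"

# Recurse through every node and keep only the values stored under a "Foo" key.
# The ? suppresses errors on non-objects; // empty drops nodes without that key.
foo_values=$(printf %s "$json" | jq -r '.. | .Foo? // empty')
printf '%s\n' "$foo_values"
```

The first command is useful for learning the layout of an unfamiliar document; the second avoids hard-coding the path at the cost of matching every Foo key at any depth.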
