JSON has become the lingua franca of the computer world for sharing information between applications. Just about anyone who writes code to interact with Web APIs or retrieve the results of an application will need to know how to parse JSON. Luckily, due to JSON’s popularity, it has wide support and there are many packages, like JMESPath, available to help parse complex JSON structures.

Frequently, when I am deploying or updating infrastructure using Ansible, I have to parse JSON results from a cloud provider or the output of a command like kubectl when interacting with Kubernetes. The output from these sources often contains a lot of information, and parsing all of it just to extract what is needed and transform it into a usable format is often difficult.

In the following example output from kubectl get node node-name -o json, there isn’t an easy way to get the status for the type Ready using Ansible’s native JSON parsing without looping through the list.

{
    "conditions": [
        {
            "status": "False",
            "type": "NetworkUnavailable"
        },
        {
            "status": "False",
            "type": "MemoryPressure"
        },
        {
            "status": "False",
            "type": "DiskPressure"
        },
        {
            "status": "False",
            "type": "PIDPressure"
        },
        {
            "status": "True",
            "type": "Ready"
        }
    ]
}

However, Ansible has solved this problem by exposing the JMESPath JSON parsing library using the json_query filter.
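
As a minimal sketch of what that looks like for the kubectl output above, the task below pulls out the status of the Ready condition. It assumes the parsed JSON is already stored in a variable called node_info (a hypothetical name, not from the original setup). Note that in recent Ansible releases the filter is shipped in the community.general collection and needs the jmespath Python package on the control node.

```yaml
# Sketch only: node_info and node_ready are hypothetical variable names.
# node_info is assumed to hold the parsed JSON shown above.
- name: Get the status of the Ready condition
  vars:
    # Keep objects whose type is Ready, take their status,
    # then pipe to [0] to pick the single matching value.
    ready_query: "conditions[?type == 'Ready'].status | [0]"
  set_fact:
    node_ready: "{{ node_info | json_query(ready_query) }}"
```

Keeping the query in a task variable avoids escaping quotes inside the Jinja2 expression. With the sample data above, node_ready should end up as the string "True".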

In this example I will demonstrate how to use Ansible and JMESPath to parse complex JSON output and transform the results into a more usable format by using a key filter and zipping two JSON lists.


For this example, I am using OpenStack Heat to create an 8-node Ceph cluster (3 MON nodes and 5 OSD nodes) that will have Ceph installed using ceph-ansible. Since I am in the development and testing phase of this project, I am frequently creating and destroying clusters. OpenStack Heat and Ansible do a good job of automating most of the creation and teardown steps, but there was still one manual step: copying the hostnames and IP addresses of the nodes created by OpenStack Heat into Ansible’s inventory file. To fully automate the process, I had to capture the output from OpenStack Heat in Ansible so I could generate the inventory file automatically from an Ansible template.

Generating the inventory file means converting this JSON:

{
    "stack_create.stack.outputs": [
        {
            "description": "Ceph osd management addresses",
            "output_key": "ceph_osd_management_addresses",
            "output_value": [
                "192.168.0.95",
                "192.168.0.101",
                "192.168.0.155",
                "192.168.0.161",
                "192.168.0.23"
            ]
        },
        {
            "description": "Ceph osd server names",
            "output_key": "ceph_osd_server_names",
            "output_value": [
                "ceph-osd-0",
                "ceph-osd-1",
                "ceph-osd-2",
                "ceph-osd-3",
                "ceph-osd-4"
            ]
        },
        {
            "description": "Ceph mon management addresses",
            "output_key": "ceph_mon_management_addresses",
            "output_value": [
                "192.168.0.117",
                "192.168.0.240",
                "192.168.0.44"
            ]
        },
        {
            "description": "Ceph mon server names",
            "output_key": "ceph_mon_server_names",
            "output_value": [
                "ceph-mon-0",
                "ceph-mon-1",
                "ceph-mon-2"
            ]
        }
    ]
}

Into the following inventory file:

[mons]
ceph-mon-0 ansible_host=192.168.0.117
ceph-mon-1 ansible_host=192.168.0.240
ceph-mon-2 ansible_host=192.168.0.44
[osds]
ceph-osd-0 ansible_host=192.168.0.95
ceph-osd-1 ansible_host=192.168.0.101
ceph-osd-2 ansible_host=192.168.0.155
ceph-osd-3 ansible_host=192.168.0.161
ceph-osd-4 ansible_host=192.168.0.23

In general, Ansible has good native parsing of JSON. However, plain dotted and indexed variable access cannot handle a case like the one above, where you need to filter a list of JSON objects based on the value of a key inside each object. For example, in the JSON output above, I want the list of IP addresses in the key output_value where output_key = ceph_mon_management_addresses. The best that can be done with index-based access is stack_create.stack.outputs[2].output_value, but that requires ceph_mon_management_addresses to always be the 3rd item in the list, which cannot be guaranteed.
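
For comparison, the index-based approach would look roughly like the task below. This is only an illustration of why it is fragile; it assumes the Heat result is registered as stack_create, as in the rest of this example.

```yaml
# Fragile sketch: this only works while the mon addresses
# happen to be the third entry (index 2) in the outputs list.
- name: Create a list of mon ip addresses by position
  set_fact:
    mon_ips: "{{ stack_create.stack.outputs[2].output_value }}"
```

If the order of the outputs ever changes, this task would silently pick up a different list, such as the OSD server names.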

Here is where Ansible’s json_query filter comes in. With JMESPath we can search a list of objects for a key-value pair and return the value of another key in the same object. In practical terms for this example, we can search the list of objects for the object where output_key = ceph_mon_management_addresses and return the value of output_value. Here is the Ansible set_fact task using a JMESPath query to get the result:

- name: Create a list of mon ip addresses
  set_fact:
    mon_ips: "{{ stack_create | json_query(\"stack.outputs[?output_key == 'ceph_mon_management_addresses'].output_value\") }}"

In this example, the search for the object where output_key == 'ceph_mon_management_addresses' is done with a JMESPath filter expression ([?...]) containing that comparison. Then we append .output_value to return the value of the output_value key. The result will look like:

[
  [
    "192.168.0.117",
    "192.168.0.240",
    "192.168.0.44"
  ]
]

Because JMESPath preserves the structure of the original JSON, the result is two nested lists: the outer list of matching objects and the inner list of IP addresses. We only want a flat list of IP addresses, so we can apply the JMESPath flatten operator. Simply add [] to the end of the expression, like so:

- name: Create a list of mon ip addresses
  set_fact:
    mon_ips: "{{ stack_create | json_query(\"stack.outputs[?output_key == 'ceph_mon_management_addresses'].output_value[]\") }}"
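
The zipping step mentioned in the introduction could then be sketched as follows. This is one possible approach, not necessarily the original one: it runs the same kind of query for ceph_mon_server_names and pairs the two lists with Ansible’s zip filter. The variable names mon_names and mon_hosts are hypothetical, and mon_ips is assumed to come from the task above.

```yaml
# Sketch only: mon_names and mon_hosts are hypothetical variable names.
- name: Create a list of mon server names
  vars:
    names_query: "stack.outputs[?output_key == 'ceph_mon_server_names'].output_value[]"
  set_fact:
    mon_names: "{{ stack_create | json_query(names_query) }}"

# zip pairs the first name with the first address, and so on.
- name: Pair each mon name with its ip address
  set_fact:
    mon_hosts: "{{ mon_names | zip(mon_ips) | list }}"

# Print the lines that an inventory template could emit.
- name: Show the inventory lines for the mons group
  debug:
    msg: "{{ item.0 }} ansible_host={{ item.1 }}"
  loop: "{{ mon_hosts }}"
```

With the sample Heat output, the first message would read ceph-mon-0 ansible_host=192.168.0.117. This pairing relies on Heat returning names and addresses in the same order, which is an assumption worth checking against your own stack.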


Complex JSON parsing with Ansible and JMESPath