Json queries with JMEPath and jq

From Ever changing code
Jump to navigation Jump to search
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.

What is JSON

JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. JSON is built on two structures:

  • A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.
  • An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.
An object
is an unordered set of name/value pairs. An object begins with { (left brace) and ends with } (right brace). Each name is followed by : (colon) and the name/value pairs are separated by , (comma).
An array
is an ordered collection of values. An array begins with [ (left bracket) and ends with ] (right bracket). Values are separated by , (comma).
A value
can be a string in double quotes, or a number, or true or false or null, or an object or an array. These structures can be nested.
A string
is a sequence of zero or more Unicode characters, wrapped in double quotes, using backslash escapes. A character is represented as a single character string. A string is very much like a C or Java string.
A number
is very much like a C or Java number, except that the octal and hexadecimal formats are not used.

Interactive tools

jqplay online

Online jq playground.

Jq playground

Jiq

Install Jiq

  1. Install jq
  2. Install Golang <1.7
  3. Export environment variables eg. to .bashrc or .profile files.
    export GOPATH=/home/ubuntu/.local/go
    export PATH="$PATH:$HOME/.local/go"
  1. Install jiq
    go get github.com/fiatjaf/jiq/cmd/jiq #required sudo in Ubuntu 18.04 WSL 
    export PATH=$PATH:~/go/bin            #location where go-golang Ubuntu package installs go-lang binaries

jmespath.terminal

This is jmespath.terminal git project without latest activity but still works.

jq

Turn JSON into CSV

aws ec2 describe-snapshots --owner-ids 777781152222 --region us-east-1 --profile prod > prod_snapshots.json
cat prod_snapshots.json \
  | jq -r '.Snapshots[] | [.Description,.Encrypted,.VolumeId,.State,.VolumeSize,.StartTime,.Progress,.OwnerId,.SnapshotId] | @csv' \
  > prod_snapshots.csv
# by using filter '.Snapshots[]' you select "Snapshots" array every object, then
# create an array with [.key1,.key2] as array is required data structure to turn into CSV
# by default double quote will be escaped ' \" ' using '-r' prevents doing it making easier to open in Excel.


Json blob

{
    "Snapshots": [
        {
            "Description": "Copied for DestinationAmi ami-85921111 from SourceAmi ami-e3045222 for SourceSnapshot snap-0ebc77af111b09222. Task created on 1,487,109,468,219.",
            "Encrypted": false,
            "VolumeId": "vol-ffffffff",
            "State": "completed",
            "VolumeSize": 8,
            "StartTime": "2017-02-14T21:57:52.000Z",
            "Progress": "100%",
            "OwnerId": "3335812595337",
            "SnapshotId": "snap-58862222"
        },
        {
            "Description": "Copied for DestinationAmi ami-ccc875a8 from SourceAmi ami-cfda333 for SourceSnapshot snap-71347333. Task created on 1,485,867,366,917.",
            "Encrypted": false,
            "VolumeId": "vol-ffffffff",
            "State": "completed",
            "VolumeSize": 60,
            "StartTime": "2017-01-31T12:56:11.000Z",
            "Progress": "100%",
            "OwnerId": "222557238111",
            "SnapshotId": "snap-2314333"
        }
    ]
}

Display only Keys that mach value

Extract AWS volumes

aws ec2 describe-volumes --region us-east-1 --profile prod > prod_volumes.json
cat prod_volumes.json \
  | cat ftr_volumes_all.json \
  | jq --raw-output '.Volumes[] | [ (.Tags[]? | select(.Key=="Name").Value, select(.Key=="Backup").Value), .VolumeId, .State, .SnapshotId, .Attachments[].InstanceId //"notAttached", .CreateTime, .Size ] | @csv' \
  > ftr_volumes_all.csv \
  > prod_volumes.csv
  • .Tags[]? means that the list/object is optional, prevents jq: error - Cannot iterate over null (null) error
  • (.Tags[]? | select(.Key=="Name").Value, select(.Key=="Backup").Value) taking expression in brackets groups multiple filters into one, so it is not affecting next filter. Remember , coma feeds date through next filter here it's .VolumeId. This produces first 2 items on in the list, see below in results.
  • .Attachments[].InstanceId //"notAttached" 2 forward slashes set default value


Blob and results
Blob Result
{
    "Volumes": [
        {
            "AvailabilityZone": "us-east-1b", 
            "Attachments": [], 
            "Tags": [
                {
                    "Value": "Comms Scheduler", 
                    "Key": "Name"
                }
            ], 
            "Encrypted": false, 
            "VolumeType": "gp2", 
            "VolumeId": "vol-1111cd2b12e622222", 
            "State": "available", 
            "Iops": 300, 
            "SnapshotId": "snap-1111810031fa22222", 
            "CreateTime": "2018-06-20T15:29:32.333Z", 
            "Size": 100
        }, 
        {
            "AvailabilityZone": "us-east-1b",
            "Attachments": [
                {
                    "AttachTime": "2018-06-27T11:13:30.000Z",
                    "InstanceId": "i-44441a50fc245555",
                    "VolumeId": "vol-77770ff240e67777",
                    "State": "attached",
                    "DeleteOnTermination": true,
                    "Device": "/dev/sda1"
                }
            ],
            "Tags": [
                {
                    "Value": "apache",
                    "Key": "Name"
                },
                {
                    "Value": "False",
                    "Key": "Backup"
                }
            ],
            "Encrypted": true,
            "VolumeType": "gp2",
            "VolumeId": "vol-123123ff240e62222",
            "State": "in-use",
            "KmsKeyId": "arn:aws:kms:us-west-1:0123481158111:key/40222222-4455-7777-8888-c06f27d59999",
            "SnapshotId": "snap-999947cb7448d8888",
            "Iops": 100,
            "CreateTime": "2018-06-27T11:13:30.582Z",
            "Size": 8
        }
    ]
}
[
  "Comms Scheduler",
  "backupFalse",
  "vol-1111cd2b12e622222",
  "available",
  "snap-1111810031fa22222",
  "notAttached",
  "2018-06-20T15:29:32.333Z",
  100
]
[
  "apache",
  "backupFalse",
  "False",
  "vol-123123ff240e62222",
  "in-use",
  "snap-999947cb7448d8888",
  "i-44441a50fc245555",
  "2018-06-27T11:13:30.582Z",
  8
]

Filtered list of instance name, type, virtualization type and instance id

At the beginning my simple convention for Json is [] is an array and {} is an object.

Get an JSON output of all instances in your AWS accounty

aws ec2 describe-instances --output json | jiq #pipe into jiq
#jiq's expressions line
Filter]> .Reservations[].Instances[] | {_name: .Tags[], _id: .InstanceId, _type: .InstanceType, _virtType: .Hypervisor }

.Reservations[].Instances[] returns all data that we need, therefore we pipe into new object { _key1: XXX, _key2: YYY }. Actually, the keys fields we define now, as: _name, _id, _id, _type.

Next, let's select only Tags we are interested in, from a list(object). Therefore, we pipe Tags via select function. Where the Tag name is Name and we want get .Value field.

{name: .Tags[] | select(.Key=="Name").Value

Full filter command:

[Filter]> .Reservations[].Instances[] | {name: .Tags[] | select(.Key=="Name").Value, id: .InstanceId, type: .InstanceType, virt_type: .Hypervisor }

{                              
  "_name": "box-2.prod1.example.com",
  "_id": "i-ecccc666",                                                          
  "_type": "c4.2xlarge",
  "_virt_type": "xen"   
}                            
{           
  "_name": "db-2.prod1.example.com",
  "_id": "i-0aaaaae989a68b6e6",                     
  "_type": "m4.large",                        
  "_virt_type": "xen"                          
}

Now, we want to display only records, that match our instance name using REGEX expression .*box.*

{_name: .Tags[] | select(.Key=="Name").Value | match(".*box.*")
[Filter]> .Reservations[].Instances[] | {_name: .Tags[] | select(.Key=="Name").Value | match(".*box.*") , _id: .InstanceId, _type: .InstanceType, _virt_type: .Hypervisor }

{                                                                                                                                                           
  "_name": {                                                                                                                                          
    "offset": 0,                                                                                                         
    "length": 25,                                                                                                                                  
    "string": "box-2.prod1.example.com",
    "captures": []       
  },                           
  "_id": "i-0c26072b64b6404dc",                            
  "_type": "m4.large",                                                          
  "_virt_type": "xen"   
}                      
{                            
  "_name": { 
    "offset": 0,                                         
    "length": 24,                                  
    "string": "db-2.prod1.example.com", 
    "captures": []                            
  },                         
  "_id": "i-0a6f8fe989a68b6e6",    
  "_type": "m4.large",                                           
  "_virt_type": "xen"                          
}

The output contains now extra info. Butm but we are only interested in string field

{_name: .Tags[] | select(.Key=="Name").Value | match(".*box.*").string

Let's sort out the output by _id field. Note, only arrays [] aka lists can be sorted using sort_by</function>. That's why we need to wrap our object {} into brackets [].

[Filter]> [.Reservations[].Instances[] | {_name: .Tags[] | select(.Key=="Name").Value | match(".*box.*").string, _id: .InstanceId, _type: .InstanceType, _virt_type: .Hypervisor }] | sort_by(._id)

Real use:

$ aws ec2 describe-instances --filters "Name=tag:Name,Values=*mft*" "Name=instance-state-code,Values=16" | jq -r '[.Reservations[].Instances[] | {_name: .Tags[] | select(.Key=="Name").Value | match(".*box.*").string, _id: .InstanceId, _type: .InstanceType, _virt_type: .Hypervisor }] | sort_by(._id)'

Awscli queries

$ aws ec2 describe-instances \
  --output table \
  --query 'Reservations[].Instances[].[Tags[?Key==`Name`] | [0].Value,InstanceId,InstanceType]'

And we get a nicely formatted table:
-----------------------------------------------
|              DescribeInstances              |
+----------------+--------------+-------------+
|  xxxxxxxxxx    |  i-a0169xxx  |  r3.large   |
|  yyyyyyyyyy    |  i-11a46xxx  |  m3.large   |
|  zzzzzzzzzzzzzz|  i-07c4axxx  |  t2.medium  |
+----------------+--------------+-------------+

References