Json queries with JMEPath and jq

From Ever changing code
Jump to navigation Jump to search

What is JSON

JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. JSON is built on two structures:

  • A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.
  • An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.
An object
is an unordered set of name/value pairs. An object begins with { (left brace) and ends with } (right brace). Each name is followed by : (colon) and the name/value pairs are separated by , (comma).
An array
is an ordered collection of values. An array begins with [ (left bracket) and ends with ] (right bracket). Values are separated by , (comma).
A value
can be a string in double quotes, or a number, or true or false or null, or an object or an array. These structures can be nested.
A string
is a sequence of zero or more Unicode characters, wrapped in double quotes, using backslash escapes. A character is represented as a single character string. A string is very much like a C or Java string.
A number
is very much like a C or Java number, except that the octal and hexadecimal formats are not used.

Interactive tools

jqplay online

Online jq playground.

Jq playground

Jiq

Install Jiq

  1. Install jq
  2. Install Golang <1.7
  3. Export environment variables eg. to .bashrc or .profile files.
    export GOPATH=/home/ubuntu/.local/go
    export PATH="$PATH:$HOME/.local/go"
  1. Install jiq
    go get github.com/fiatjaf/jiq/cmd/jiq #required sudo in Ubuntu 18.04 WSL 
    export PATH=$PATH:~/go/bin            #location where go-golang Ubuntu package installs go-lang binaries

jmespath.terminal

This is jmespath.terminal git project without latest activity but still works.

jq

Turn JSON into CSV

aws ec2 describe-snapshots --owner-ids 777781152222 --region us-east-1 --profile prod > prod_snapshots.json
cat prod_snapshots.json \
  | jq -r '.Snapshots[] | [.Description,.Encrypted,.VolumeId,.State,.VolumeSize,.StartTime,.Progress,.OwnerId,.SnapshotId] | @csv' \
  > prod_snapshots.csv
# by using filter '.Snapshots[]' you select "Snapshots" array every object, then
# create an array with [.key1,.key2] as array is required data structure to turn into CSV
# by default double quote will be escaped ' \" ' using '-r' prevents doing it making easier to open in Excel.


Json blob

{
    "Snapshots": [
        {
            "Description": "Copied for DestinationAmi ami-85921111 from SourceAmi ami-e3045222 for SourceSnapshot snap-0ebc77af111b09222. Task created on 1,487,109,468,219.",
            "Encrypted": false,
            "VolumeId": "vol-ffffffff",
            "State": "completed",
            "VolumeSize": 8,
            "StartTime": "2017-02-14T21:57:52.000Z",
            "Progress": "100%",
            "OwnerId": "3335812595337",
            "SnapshotId": "snap-58862222"
        },
        {
            "Description": "Copied for DestinationAmi ami-ccc875a8 from SourceAmi ami-cfda333 for SourceSnapshot snap-71347333. Task created on 1,485,867,366,917.",
            "Encrypted": false,
            "VolumeId": "vol-ffffffff",
            "State": "completed",
            "VolumeSize": 60,
            "StartTime": "2017-01-31T12:56:11.000Z",
            "Progress": "100%",
            "OwnerId": "222557238111",
            "SnapshotId": "snap-2314333"
        }
    ]
}

Display only Keys that mach value

Extract AWS volumes

aws ec2 describe-volumes --region us-east-1 --profile prod > prod_volumes.json
cat prod_volumes.json \
  | cat ftr_volumes_all.json \
  | jq --raw-output '.Volumes[] | [ (.Tags[]? | select(.Key=="Name").Value, select(.Key=="Backup").Value), .VolumeId, .State, .SnapshotId, .Attachments[].InstanceId //"notAttached", .CreateTime, .Size ] | @csv' \
  > ftr_volumes_all.csv \
  > prod_volumes.csv
  • .Tags[]? means that the list/object is optional, prevents jq: error - Cannot iterate over null (null) error
  • (.Tags[]? | select(.Key=="Name").Value, select(.Key=="Backup").Value) taking expression in brackets groups multiple filters into one, so it is not affecting next filter. Remember , coma feeds date through next filter here it's .VolumeId. This produces first 2 items on in the list, see below in results.
  • .Attachments[].InstanceId //"notAttached" 2 forward slashes set default value


Blob and results
Blob Result
{
    "Volumes": [
        {
            "AvailabilityZone": "us-east-1b", 
            "Attachments": [], 
            "Tags": [
                {
                    "Value": "Comms Scheduler", 
                    "Key": "Name"
                }
            ], 
            "Encrypted": false, 
            "VolumeType": "gp2", 
            "VolumeId": "vol-1111cd2b12e622222", 
            "State": "available", 
            "Iops": 300, 
            "SnapshotId": "snap-1111810031fa22222", 
            "CreateTime": "2018-06-20T15:29:32.333Z", 
            "Size": 100
        }, 
        {
            "AvailabilityZone": "us-east-1b",
            "Attachments": [
                {
                    "AttachTime": "2018-06-27T11:13:30.000Z",
                    "InstanceId": "i-44441a50fc245555",
                    "VolumeId": "vol-77770ff240e67777",
                    "State": "attached",
                    "DeleteOnTermination": true,
                    "Device": "/dev/sda1"
                }
            ],
            "Tags": [
                {
                    "Value": "apache",
                    "Key": "Name"
                },
                {
                    "Value": "False",
                    "Key": "Backup"
                }
            ],
            "Encrypted": true,
            "VolumeType": "gp2",
            "VolumeId": "vol-123123ff240e62222",
            "State": "in-use",
            "KmsKeyId": "arn:aws:kms:us-west-1:0123481158111:key/40222222-4455-7777-8888-c06f27d59999",
            "SnapshotId": "snap-999947cb7448d8888",
            "Iops": 100,
            "CreateTime": "2018-06-27T11:13:30.582Z",
            "Size": 8
        }
    ]
}
[
  "Comms Scheduler",
  "backupFalse",
  "vol-1111cd2b12e622222",
  "available",
  "snap-1111810031fa22222",
  "notAttached",
  "2018-06-20T15:29:32.333Z",
  100
]
[
  "apache",
  "backupFalse",
  "False",
  "vol-123123ff240e62222",
  "in-use",
  "snap-999947cb7448d8888",
  "i-44441a50fc245555",
  "2018-06-27T11:13:30.582Z",
  8
]

Filtered list of instance name, type, virtualization type and instance id

At the beginning my simple convention for Json is [] is an array and {} is an object.

Get an JSON output of all instances in your AWS accounty

aws ec2 describe-instances --output json | jiq #pipe into jiq
#jiq's expressions line
Filter]> .Reservations[].Instances[] | {_name: .Tags[], _id: .InstanceId, _type: .InstanceType, _virtType: .Hypervisor }

.Reservations[].Instances[] returns all data that we need, therefore we pipe into new object { _key1: XXX, _key2: YYY }. Actually, the keys fields we define now, as: _name, _id, _id, _type.

Next, let's select only Tags we are interested in, from a list(object). Therefore, we pipe Tags via select function. Where the Tag name is Name and we want get .Value field.

{name: .Tags[] | select(.Key=="Name").Value

Full filter command:

[Filter]> .Reservations[].Instances[] | {name: .Tags[] | select(.Key=="Name").Value, id: .InstanceId, type: .InstanceType, virt_type: .Hypervisor }

{                              
  "_name": "box-2.prod1.example.com",
  "_id": "i-ecccc666",                                                          
  "_type": "c4.2xlarge",
  "_virt_type": "xen"   
}                            
{           
  "_name": "db-2.prod1.example.com",
  "_id": "i-0aaaaae989a68b6e6",                     
  "_type": "m4.large",                        
  "_virt_type": "xen"                          
}

Now, we want to display only records, that match our instance name using REGEX expression .*box.*

{_name: .Tags[] | select(.Key=="Name").Value | match(".*box.*")
[Filter]> .Reservations[].Instances[] | {_name: .Tags[] | select(.Key=="Name").Value | match(".*box.*") , _id: .InstanceId, _type: .InstanceType, _virt_type: .Hypervisor }

{                                                                                                                                                           
  "_name": {                                                                                                                                          
    "offset": 0,                                                                                                         
    "length": 25,                                                                                                                                  
    "string": "box-2.prod1.example.com",
    "captures": []       
  },                           
  "_id": "i-0c26072b64b6404dc",                            
  "_type": "m4.large",                                                          
  "_virt_type": "xen"   
}                      
{                            
  "_name": { 
    "offset": 0,                                         
    "length": 24,                                  
    "string": "db-2.prod1.example.com", 
    "captures": []                            
  },                         
  "_id": "i-0a6f8fe989a68b6e6",    
  "_type": "m4.large",                                           
  "_virt_type": "xen"                          
}

The output contains now extra info. Butm but we are only interested in string field

{_name: .Tags[] | select(.Key=="Name").Value | match(".*box.*").string

Let's sort out the output by _id field. Note, only arrays [] aka lists can be sorted using sort_by</function>. That's why we need to wrap our object {} into brackets [].

[Filter]> [.Reservations[].Instances[] | {_name: .Tags[] | select(.Key=="Name").Value | match(".*box.*").string, _id: .InstanceId, _type: .InstanceType, _virt_type: .Hypervisor }] | sort_by(._id)

Real use:

$ aws ec2 describe-instances --filters "Name=tag:Name,Values=*mft*" "Name=instance-state-code,Values=16" | jq -r '[.Reservations[].Instances[] | {_name: .Tags[] | select(.Key=="Name").Value | match(".*box.*").string, _id: .InstanceId, _type: .InstanceType, _virt_type: .Hypervisor }] | sort_by(._id)'

Awscli queries

$ aws ec2 describe-instances \
  --output table \
  --query 'Reservations[].Instances[].[Tags[?Key==`Name`] | [0].Value,InstanceId,InstanceType]'

And we get a nicely formatted table:
-----------------------------------------------
|              DescribeInstances              |
+----------------+--------------+-------------+
|  xxxxxxxxxx    |  i-a0169xxx  |  r3.large   |
|  yyyyyyyyyy    |  i-11a46xxx  |  m3.large   |
|  zzzzzzzzzzzzzz|  i-07c4axxx  |  t2.medium  |
+----------------+--------------+-------------+

References