Thursday, 24 November 2016

Using jq to read JSON output (from AWS CLI commands)

We want to write a script to update AWS Route53 records as a server instantiates.



We're using "split horizon" DNS, that is to say, DNS where we have two zones: an internal one and an external one.  The internal zone allows servers in the private IP space to resolve server names to internal addresses, and the external zone allows the rest of the world to resolve (some of) our servers.

We need to be able to update the internal DNS.  The relevant AWS CLI command is:

aws route53 list-hosted-zones (or list-hosted-zones-by-name; you can omit the -by-name if you want).  This reports information about zones in either text or JSON format.  The text format splits output across 2 lines, which isn't too hard to handle with (say) sed, but the data also seems (we believe) to be incomplete.  To get all the information regarding hosted zones, you need the JSON output.  This, however, is not amenable to parsing with the traditional Unix tools (sed, awk, etc.).  We need jq.


Note: all examples used jq 1.5:

$ jq --version
jq-1.5-1-a5b5cbe




An example:

aws route53 list-hosted-zones  | jq '."HostedZones"[] | select(.Config.PrivateZone==true )| select(.Name=="xxxxxxx-poc.uk.")'


jq works with filters, which you connect together with pipes, just like at the Unix shell.  Here ."HostedZones"[] iterates over the HostedZones array (list) and emits each zone object in turn; these are piped into a select, which keeps only the entries where Config.PrivateZone has the boolean value true.  That output is piped into a further select, which keeps only the zones named "xxxxxxx-poc.uk."  NOTE: there is a dot at the end of the zone name.
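
To watch each stage narrow the stream, here is a minimal sketch run against a made-up sample document rather than the live API.  The zone names and Ids are invented; only the field layout follows the list-hosted-zones output used above.

```shell
# Hypothetical sample standing in for `aws route53 list-hosted-zones` output.
zones='{"HostedZones":[
  {"Id":"/hostedzone/ZEXAMPLE1","Name":"example-poc.uk.","Config":{"PrivateZone":true}},
  {"Id":"/hostedzone/ZEXAMPLE2","Name":"example-poc.uk.","Config":{"PrivateZone":false}},
  {"Id":"/hostedzone/ZEXAMPLE3","Name":"other.uk.","Config":{"PrivateZone":true}}
]}'

# Stage 1: .HostedZones[] emits the zone objects one at a time; count them.
all_count=$(printf '%s' "$zones" | jq '[.HostedZones[]] | length')

# Stage 2: the first select keeps only the private zones.
private_names=$(printf '%s' "$zones" |
  jq -c '[.HostedZones[] | select(.Config.PrivateZone==true) | .Name]')

# Stage 3: the second select narrows the result to a single zone name.
match_id=$(printf '%s' "$zones" |
  jq '.HostedZones[] | select(.Config.PrivateZone==true) | select(.Name=="example-poc.uk.") | .Id')

echo "$all_count"
echo "$private_names"
echo "$match_id"
```

Wrapping a filter in [ ... ] collects its outputs back into a single array, which is handy for counting or for compact (-c) printing.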

If we want to see just the Id of the selected zone, add .Id at the end:


$ aws route53 list-hosted-zones  | jq '."HostedZones"[] | select(.Config.PrivateZone==true )| select(.Name=="sainsburys-poc.uk.").Id'
"/hostedzone/Z1E7TO5JG0M9ZC"
This can then be piped to (e.g.) sed:

| sed 's#^.*/##'

which uses a greedy regex (the .*) to match everything up to the final / in the line, and replace it with nothing.  The closing " is left in place.

You could be clever:
| sed 's#^.*/\([^"]*\).*$#\1#'
to get rid of the " as well, by tagging (capturing) a group which matches everything up to the quote.

Or simply add | tr -d '"' to delete the quotes using tr (note the space after -d).
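
Two alternatives avoid the extra sed or tr process, sketched here with a made-up Id.  jq's -r (raw output) flag prints strings without the surrounding quotes, and the shell's own ${var##*/} expansion strips everything up to the last slash.

```shell
# Hypothetical one-zone document; the Id is invented.
json='{"HostedZones":[{"Id":"/hostedzone/ZEXAMPLE1"}]}'

# -r prints the string raw, so there are no quotes to remove afterwards.
raw_id=$(printf '%s' "$json" | jq -r '.HostedZones[0].Id')

# ${var##*/} removes the longest prefix ending in "/" (plain POSIX shell).
zone_id=${raw_id##*/}

echo "$zone_id"
```

Whether this is nicer than sed is a matter of taste; it does keep the whole job to one jq call and one shell expansion.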

Then, there's passing variables in to jq.
Like so:

jq --arg jq_varname "$shell_variable"

Which is then used in jq like so:

jq --arg t_name "$tag_name" '.Tags[] | select(.Key==$t_name).Value' (untested; this assumes AWS-style tags, which have Key and Value fields)

for example.  All very natural, except that there's no equals sign between the variable and its assigned value on the command line.
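
A small runnable sketch of the same idea, using an invented tag list in the Key/Value shape that AWS uses for tags.  The names tag_name and t_name are simply the ones from the example above; the resource itself is hypothetical.

```shell
# Hypothetical resource with AWS-style tags.
resource='{"Tags":[{"Key":"Name","Value":"web01"},{"Key":"Env","Value":"poc"}]}'

tag_name="Env"

# --arg binds the shell value to $t_name inside jq, always as a string.
# Quoting "$tag_name" keeps values containing spaces in one piece.
tag_value=$(printf '%s' "$resource" |
  jq -r --arg t_name "$tag_name" '.Tags[] | select(.Key==$t_name) | .Value')

echo "$tag_value"
```

Because --arg always passes a string, comparing against a JSON number or boolean needs --argjson instead, which jq 1.5 also provides.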


Wildcards

Sometimes we want to return records which match a pattern.  The simplest tool is contains(), which does a plain substring match rather than a regex, e.g.:

aws ec2 describe-images | jq '.Images[] | select(.Name!=null) | select(.VirtualizationType=="hvm") | select(.Name | contains("entos"))'

---- More needed here - that's no wildcard!
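
For a real pattern match, jq 1.5 added test(), which takes a regular expression (it needs a jq built with the Oniguruma library, which packaged builds usually are).  A sketch against invented image names:

```shell
# Hypothetical sample in the shape of `aws ec2 describe-images` output.
images='{"Images":[
  {"Name":"CentOS Linux 7 x86_64 HVM","VirtualizationType":"hvm"},
  {"Name":"centos-6-pv","VirtualizationType":"paravirtual"},
  {"Name":"ubuntu-16.04-hvm","VirtualizationType":"hvm"},
  {"Name":null,"VirtualizationType":"hvm"}
]}'

# test(regex; flags): the "i" flag makes the match case-insensitive.
# The null check comes first because test() fails on non-string input.
matches=$(printf '%s' "$images" |
  jq -r '.Images[]
         | select(.Name!=null)
         | select(.VirtualizationType=="hvm")
         | select(.Name | test("^centos.*hvm$"; "i"))
         | .Name')

echo "$matches"
```

Shell-style globs are not available, so a wildcard such as centos*hvm has to be written as the regex ^centos.*hvm$.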


Returning Multiple Values

I want to return multiple values from a list (one of which needs to be converted to a string so I can concatenate it to the end of the other):

curl -X GET -ugrahamnicholls:xxx -k https://foreman.test-poc.uk/api/architectures/1 | jq '.operatingsystems[] | [ .name, (.id|tostring) ] | join(": ")'
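
The same formatting can be tried offline.  The sample below is invented but follows the operatingsystems list shape used above; it also shows jq's string interpolation, which converts the number for you.

```shell
# Hypothetical sample in the shape of the Foreman architectures response.
arch='{"operatingsystems":[{"name":"CentOS","id":1},{"name":"Debian","id":2}]}'

# Array plus join: tostring turns the numeric id into a string first.
joined=$(printf '%s' "$arch" |
  jq -r '.operatingsystems[] | [.name, (.id|tostring)] | join(": ")')

# String interpolation: \( ... ) stringifies non-string values itself.
interp=$(printf '%s' "$arch" |
  jq -r '.operatingsystems[] | "\(.name): \(.id)"')

echo "$joined"
echo "$interp"
```

Both forms print one "name: id" line per operating system; interpolation tends to read better once more than two fields are involved.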

Various Example Queries


Show the CentOS operating system:

$ curl -X GET -ugrahamnicholls:xxxxx -k https://foreman.sainsburys-poc.uk/api/operatingsystems/ 2>/dev/null | jq '.results[] | select(.name == "CentOS")'

Startswith example:


$ aws ec2 describe-images --output=json --owners=744647245289 --region=eu-west-1 | jq -M '.Images[] | select(.Name | startswith("CentOS")).Name'