HPR4114: Introduction to jq - part 2

Released Thursday, 9th May 2024

Overview

In the last episode we looked at how JSON data is structured and saw how jq could be used to format and print this type of data.

In this episode we'll visit a few of the options to the jq command and then start on the filters written in the jq language.

Options used by jq

In general the jq command is invoked thus:

jq [options...] filter [files...]

It can be given data in files, or data can be sent to it via the STDIN (standard in) channel. We saw data being sent this way in the last episode, having been downloaded by curl.
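
As a minimal sketch of these two ways of supplying data, the following writes a small JSON object to a temporary file and runs the identity filter on it both ways. The temporary file and the variable names are made up for this illustration, and it assumes jq and mktemp are available in a Bourne-style shell.

```shell
# Hypothetical sample data written to a temporary file
tmp=$(mktemp)
printf '%s\n' '{"name":"Hacker Public Radio","type":"podcast"}' > "$tmp"

# 1. Data given to jq as a file argument
from_file=$(jq '.' "$tmp")

# 2. The same data sent to jq on STDIN through a shell pipe
from_stdin=$(cat "$tmp" | jq '.')

# Both invocations should produce the same formatted output
printf '%s\n' "$from_file"

rm -f "$tmp"
```

In practice the piped form is the one to use when the data comes from another command such as curl.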

There are many options to the command, and these are listed in the manual page and in the online manual. We will look at a few of them here:

--help or -h

Output the jq help and exit with zero.

-f filename or --from-file filename

Read filter from the file rather than from a command line, like awk's -f option. You can also use '#' to make comments in the file.

--compact-output or -c

By default, jq pretty-prints JSON output. Using this option will result in more compact output by instead putting each JSON object on a single line.

--color-output or -C and --monochrome-output or -M

By default, jq outputs colored JSON if writing to a terminal. You can force it to produce color even if writing to a pipe or a file using -C, and disable color with -M.

--tab

Use a tab for each indentation level instead of two spaces.

--indent n

Use the given number of spaces (no more than 7) for indentation.
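
To see some of these options side by side, here is a small sketch that runs the same object through -c, --indent, --tab and -f. The sample object is the simple one used later in these notes; the temporary filter file and the variable names are assumptions made for the example.

```shell
json='{"name":"Hacker Public Radio","type":"podcast"}'

# -c: the whole object on a single line
compact=$(printf '%s\n' "$json" | jq -c '.')

# --indent 4: four spaces per indentation level instead of two
indented=$(printf '%s\n' "$json" | jq --indent 4 '.')

# --tab: one tab per indentation level
tabbed=$(printf '%s\n' "$json" | jq --tab '.')

# -f: read the filter from a file, which may contain '#' comments
filter_file=$(mktemp)
cat > "$filter_file" <<'EOF'
# This line is a comment; the filter is on the next line
.name
EOF
from_file=$(printf '%s\n' "$json" | jq -f "$filter_file")
rm -f "$filter_file"

printf '%s\n' "$compact" "$indented" "$tabbed" "$from_file"
```

The filter in the file uses the key-lookup syntax described later in these notes; here it just picks out the "name" value.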

Notes

  • The -C option is useful when printing output to the less command with the colours that jq normally generates. Use this:

    jq -C '.' file.json | less -R

    The -R option to less allows colour escape sequences to pass through.

  • Do not do what I did recently. Accidentally leaving the -C option on the command caused formatted.json to contain all the escape codes used to colour the output:

    $ jq -C '.' file.json > formatted.json

    This is why jq normally only generates coloured output when writing to the terminal.
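
A quick way to check for this mistake is to look for the ESC character in the output. The sketch below captures jq output with -C and with -M and tests each for escape codes. The sample object and the variable names are made up for the illustration.

```shell
json='{"name":"Hacker Public Radio"}'
esc=$(printf '\033')   # the ESC character that begins a colour sequence

# Output captured from a pipe, with colour forced on and forced off
coloured=$(printf '%s\n' "$json" | jq -C '.')
plain=$(printf '%s\n' "$json" | jq -M '.')

# Does each result contain an ESC character?
case "$coloured" in
    *"$esc"*) coloured_has_esc=yes ;;
    *)        coloured_has_esc=no ;;
esac
case "$plain" in
    *"$esc"*) plain_has_esc=yes ;;
    *)        plain_has_esc=no ;;
esac

echo "-C contains escape codes: $coloured_has_esc"
echo "-M contains escape codes: $plain_has_esc"
```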

Filters in jq

As we saw in the last episode, JSON can contain arrays and objects. Arrays are enclosed in square brackets and their elements can be any of the data types we saw last time. So, arrays of arrays, arrays of objects, and arrays of both of these are all possible.

Objects contain collections of keyed items where the keys are strings of various types and the values they are associated with can be any of the data types.

JSON Examples

Simple arrays:

[1,2,3]
[1,2,3,[4,5,6]]
["Hacker","Public","Radio"]
["Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday"]

Simple object:

{
  "name": "Hacker Public Radio",
  "type": "podcast"
}

This more complex object was generated by the Random User Generator API. It is a subset of what can be obtained from this site.

{
  "gender": "female",
  "name": {
    "title": "Mrs",
    "first": "Jenny",
    "last": "Silva"
  },
  "dob": {
    "date": "1950-01-03T21:38:19.583Z",
    "age": 74
  },
  "nat": "GB"
}

This one comes from the file countries.json from the GitHub project mledoze/countries. It is a subset of the entry for Mexico.

{
  "name": {
    "common": "Mexico",
    "official": "United Mexican States",
    "native": {
      "spa": {
        "official": "Estados Unidos Mexicanos",
        "common": "México"
      }
    }
  },
  "capital": [
    "Mexico City"
  ],
  "borders": [
    "BLZ",
    "GTM",
    "USA"
  ]
}

Identity filter

This is the simplest filter, which we already encountered in episode 1: '.'. It takes its input and produces the same value as output. Since the default action is to pretty-print the output, it formats the data:

$ echo '["Hacker","Public","Radio"]' | jq .
[
  "Hacker",
  "Public",
  "Radio"
]

Notice that the filter is not enclosed in quotes in this example. This is usually fine for the simplest filters which don't contain any characters which are of significance to the shell. It's probably a good idea to always use (single) quotes however.

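One thing to be aware of is how jq handles numbers. It may convert them to IEEE 754 double-precision floating point internally, so, depending on the jq version, very large integers or numbers with many significant digits may not be output exactly as they were input. Consult the jq documentation for details.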

Object Identifier-Index filter

This form of filter refers to object keys. A particular key is usually referenced with a full-stop followed by the name of the key.

In the HPR statistics data there is a top-level key "hosts" which refers to the number of currently registered hosts. This can be obtained thus (assuming the JSON is in the file stats.json):

$ jq '.hosts' stats.json
357

The statistics file contains a key 'stats_generated' which marks a Unix time value (seconds since the Unix Epoch 1970-01-01). This can be decoded on the command line like this:

$ date -d "@$(jq '.stats_generated' stats.json)" +'%F %T'
2024-04-18 15:30:07

Here the '-d' option to date provides the date to print, and if it begins with a '@' character it's interpreted as seconds since the Epoch. Note that the result is in my local time zone which is currently UTC +0100 (aka BST).

Using object keys in this way only works if the keys contain only ASCII letters, digits and underscores, and don't start with a digit. To use other characters it's necessary to enclose the key in double quotes or square brackets and double quotes. So, assuming the key we used earlier had been altered to 'stats-generated' we could use either of these expressions:

."stats-generated"
.["stats-generated"]

Of course, the .[<string>] form is valid in all contexts. Here <string> represents a JSON string in double quotes. The jq documentation refers to this as an Object Index.
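
Here is a small sketch of both quoted forms in use. The one-key object is a stand-in for the altered statistics data, and the number in it is just a sample epoch value chosen for the illustration. The last test shows what I would expect from the unquoted form: jq reads the hyphen as a minus sign, so the filter is rejected rather than treated as a key lookup.

```shell
# Hypothetical one-key object; the number is only a sample epoch value
json='{"stats-generated": 1713450607}'

# The two quoted forms refer to the same key
dotted=$(printf '%s\n' "$json" | jq '."stats-generated"')
bracketed=$(printf '%s\n' "$json" | jq '.["stats-generated"]')

# Unquoted, jq parses this as '.stats' minus 'generated',
# and 'generated' is not a function it knows about
if printf '%s\n' "$json" | jq '.stats-generated' >/dev/null 2>&1; then
    unquoted=accepted
else
    unquoted=rejected
fi

echo "dotted=$dotted bracketed=$bracketed unquoted=$unquoted"
```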

What if you want the next_free value discussed in the last episode (number of shows until the next free slot)? Just typing the following will not work:

$ jq '.next_free' stats.json
null

This is showing that there is no key next_free at the top level of the object, the key we want is in the object with the key slot.

If you request the slot key this will happen:

$ jq '.slot' stats.json
{
  "next_free": 8,
  "no_media": 0
}

Here an object has been returned, but we actually want the value within it, as we know.

This is where we can chain filters like this:

$ jq '.slot | .next_free' stats.json
8

The pipe symbol causes the result of the first filter to be passed to the second filter. Note that the pipe here is not the same as the Unix pipe, although it looks the same.

There is a shorthand way of doing this "chaining":

$ jq '.slot.next_free' stats.json
8

This is a bit like a file system path, and makes the extraction of desired data easier to visualise and therefore quite straightforward, I think.
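
For anyone without the statistics file to hand, the same steps can be tried on a cut-down stand-in. The object below contains only the keys and values quoted in these notes and is not the real stats.json; the variable names are made up for the example.

```shell
# Cut-down stand-in for stats.json, using only values shown in these notes
json='{"hosts":357,"slot":{"next_free":8,"no_media":0}}'

# Asking for a key that is not at the top level gives null
missing=$(printf '%s\n' "$json" | jq '.next_free')

# Chaining with the jq pipe, then the shorthand path form
piped=$(printf '%s\n' "$json" | jq '.slot | .next_free')
path=$(printf '%s\n' "$json" | jq '.slot.next_free')

echo "missing=$missing piped=$piped path=$path"
```

Note that two different pipes appear here: the one outside the quotes belongs to the shell, and the one inside the quotes belongs to jq.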

Array index filter

We have seen the object index filter .[<string>] where <string> represents a key in the object we are working with.

It makes sense for array indexing to be .[<number>] where <number> represents an integer starting at zero, or a negative integer. The meaning of the negative number is to count backwards from the last element of the array (which is -1).

So, some examples might be:

$ echo '["Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday"]' | jq '.[1]'
"Monday"

$ echo '["Sun","Mon","Tue","Wed","Thu","Fri","Sat"]' | jq '.[-1]'
"Sat"

$ echo '[1, 2, 3, [4, 5, 6]]' | jq '.[-1]'
[
  4,
  5,
  6
]
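
Array indexing can be combined with the object filters and the jq pipe from earlier. As a sketch, this uses a cut-down copy of the Mexico entry shown above (only the "capital" and "borders" keys); the variable names are made up for the example.

```shell
# Part of the Mexico entry from the JSON examples above
json='{"capital":["Mexico City"],"borders":["BLZ","GTM","USA"]}'

# Select an array by key, then pipe it to an array index filter
first_border=$(printf '%s\n' "$json" | jq '.borders | .[0]')
last_border=$(printf '%s\n' "$json" | jq '.borders | .[-1]')
capital=$(printf '%s\n' "$json" | jq '.capital | .[0]')

echo "$first_border $last_border $capital"
```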

We will look at more of the basic filters in the next episode.
