Working with JSON in bash using jq

jq is a powerful tool that lets you read, filter, and write JSON in bash


Want the TL;DR? See the jq cheatsheet.

Perhaps you’ve seen or even written bash that looks like this:

curl -s 'http://api.icndb.com/jokes/random' \
| python -m json.tool \
| grep '\"joke\"' \
| cut -d ':' -f 2 \
| sed 's/"/\"/g'

(Note: the above code was taken from https://hackernoon.com/a-crude-imessage-api-efed29598e61, which is a great article).

That’s tough to read and even tougher to write. You have to pipe to four different utilities just to get at one property in the JSON response body! Bash doesn’t understand JSON out of the box, and the typical text-manipulation tools like grep, sed, or awk get awkward quickly. Luckily, there’s a better way: a tool called jq.

jq can simplify the above bash to this:

curl -s "http://api.icndb.com/jokes/random" | jq '.value.joke'

That’s much nicer 😎. By making JSON easy to work with in bash, jq opens up a lot of automation possibilities that otherwise required me to write something in node.js (which isn’t bad; it just generally takes longer).

Why not just use node.js when you need to deal with JSON?

Sometimes node.js is the right tool. For most automation tasks, though, I like to use bash whenever possible because it’s faster and more portable (I can share a bash script with team members who don’t have node.js installed). To me, bash is more expressive and succinct than node for certain tasks.

Install jq

jq usually isn’t installed by default, so you’ll have to install it yourself. On macOS, run brew install jq. See jq’s install help page for how to install on other environments.

Basics of jq

jq works similarly to sed or awk: it’s a filter that you pipe JSON into and extract values from. Also like sed or awk, it basically has its own domain-specific language (DSL) for querying JSON. Luckily, it’s really intuitive (unlike awk 😜).

Get a property

Let’s say we have JSON that looks like this:

{ "foo": 123, "bar": 456 }

To print out the foo property, we use the . operator followed by the property name.

echo '{ "foo": 123, "bar": 456 }' | jq '.foo'

That will print out 123, as expected.

This works with nesting too. So .a.b.c.d will traverse down nested objects’ properties.
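As a quick illustration of nested access, here is a small made-up document (the keys a through d are just placeholders):

```shell
# Walk four levels down into a nested object.
echo '{ "a": { "b": { "c": { "d": 42 } } } }' | jq '.a.b.c.d'
# prints 42

# A missing key anywhere along the path yields null rather than an error.
echo '{ "a": { "b": {} } }' | jq '.a.b.c.d'
# prints null
```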

This, all by itself, is pretty useful. For a realistic and totally useful example, let’s write a script that gets the Astronomy Picture of the Day and sets it as our wallpaper (this is macOS only).
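A minimal sketch of such a script might look like the following. It assumes NASA’s public APOD endpoint with its shared DEMO_KEY, a hypothetical download path in ~/Pictures, and Finder’s AppleScript interface for setting the desktop picture. The function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: the endpoint, DEMO_KEY, the file path, and the osascript
# call are assumptions for illustration.

# Read an APOD JSON response on stdin and print the image URL.
# Prefers .hdurl and falls back to .url when hdurl is absent.
apod_image_url() {
  jq -r '.hdurl // .url'
}

# Download today's picture and set it as the desktop wallpaper (macOS only).
set_apod_wallpaper() {
  local image_path="$HOME/Pictures/apod.jpg"   # hypothetical location
  local image_url
  image_url=$(curl -s 'https://api.nasa.gov/planetary/apod?api_key=DEMO_KEY' | apod_image_url)
  curl -s -o "$image_path" "$image_url"
  osascript -e "tell application \"Finder\" to set desktop picture to POSIX file \"$image_path\""
}
```

Calling set_apod_wallpaper runs the whole thing. The only JSON-specific part is the one-line jq filter; on days when the picture of the day is actually a video, this sketch makes no attempt to handle it.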


Note that if a property name has spaces or other special characters in it, you’ll have to use quotes. For example:

echo '{ "Version Number": "1.2.3" }' | jq '."Version Number"'

Also, be sure to always wrap your jq selector in single quotes. Otherwise bash tries to interpret symbols like |, [ ], $, and the double quotes itself, whereas we want jq to handle them.
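One consequence of single-quoting is that shell variables are not expanded inside the selector. A possible way around that, sketched here with a made-up variable name, is jq’s --arg option, which hands a shell value to jq as a string variable:

```shell
# prop is a hypothetical shell variable holding the property we want.
prop="Version Number"

# --arg binds the shell value to $name inside the filter, so the filter
# itself can stay safely single-quoted.
echo '{ "Version Number": "1.2.3" }' | jq --arg name "$prop" '.[$name]'
# prints "1.2.3"
```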

Iteration

Now let’s see how iteration works. The array or object value iterator operator, .[], is what makes it possible.

Here’s a really basic example:

echo '[1,2,3]' | jq '.[]'

That will output 1, 2, 3 on separate lines.

In an array of objects, you can access a property on each item in the array like so:

echo '[ {"id": 1}, {"id": 2} ]' | jq '.[].id'

Or on an object, .[] will output the value of each key/value pair:

echo '{ "a": 1, "b": 2 }' | jq '.[]'

So that will output 1 and 2 on separate lines.

Note that you can also pass an index to .[], so

echo '["foo", "bar"]' | jq '.[1]'

will return just "bar".

Now how do we do something for each line? In the same way you’d handle anything that outputs multiple lines of information in bash: xargs, for loops, or commands that accept multiple input items themselves. For example, touch doesn’t read from stdin, so we hand the values to it with xargs:

echo '["foo", "bar"]' | jq '.[]' | xargs touch
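For comparison, a loop version might look like the sketch below. It assumes jq’s -r (raw output) flag so that the values arrive without JSON quotes, and it writes into a temporary directory purely to keep the example tidy.

```shell
# Create one file per array element using a while-read loop.
workdir=$(mktemp -d)   # scratch directory, just for this example

echo '["foo", "bar"]' | jq -r '.[]' | while IFS= read -r name; do
  touch "$workdir/$name"
done

# The directory now contains two files, bar and foo.
ls "$workdir"
```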

jq Functions

jq also has built-in “functions”. Returning to the previous object iteration example, let’s say we wanted to get the keys of the object (not the values). The keys function returns them as an array, which we can then iterate over:

echo '{ "a": 1, "b": 2 }' | jq 'keys | .[]'

which will output "a" and "b" on separate lines. Note that we’re also using the pipe | operator, which works much the same in jq as it does in bash: it takes the output from the left and passes it as input to the right.

Another handy function for arrays and objects is the length function, which returns the number of elements in an array or the number of keys in an object.

echo '[1,2,3]' | jq 'length'

You can get even fancier and create intermediary arrays and objects on the fly. Here, I’m combining the keys of the dependencies and devDependencies objects (from a package.json file) into an array, flattening it, and then getting the length.

jq -r '[(.dependencies, .devDependencies) | keys] | flatten | length' package.json

That returns the number of dependencies and devDependencies a package.json contains.
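To see what each stage produces, here is the same filter run against a small inline document instead of a real package.json (the package names are only sample data):

```shell
sample='{
  "dependencies": { "react": "^18.0.0", "lodash": "^4.17.21" },
  "devDependencies": { "jest": "^29.0.0" }
}'

# (.dependencies, .devDependencies) | keys  -> ["lodash","react"] then ["jest"]
# [ ... ]                                   -> [["lodash","react"],["jest"]]
# flatten                                   -> ["lodash","react","jest"]
# length                                    -> 3
echo "$sample" | jq '[(.dependencies, .devDependencies) | keys] | flatten | length'
# prints 3
```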

Creating objects

You can also create objects on the fly. This can be useful for re-shaping some JSON. For example:

echo '{"user": {"id": 1, "name": "Cameron"}}' | jq '{ name: .user.name }'
# { "name": "Cameron" }
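Object construction also combines with the iterator from earlier. As a sketch with invented sample data, this reshapes each element of an array; the -c flag asks jq for compact, one-line output:

```shell
# Build a smaller object from every element of the input array.
echo '[{"user": {"id": 1, "name": "Cameron"}}, {"user": {"id": 2, "name": "Alex"}}]' \
  | jq -c '.[] | { id: .user.id, name: .user.name }'
# prints:
# {"id":1,"name":"Cameron"}
# {"id":2,"name":"Alex"}
```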

Let’s use it for real now

What if I wanted to audit my package.json dependencies and remove anything that’s not being used? Unused dependencies slow down npm installs for everyone and are just messy. I could manually search for usages of each dependency (via grep or in my IDE), but if you have a lot of dependencies that gets tedious fast, so let’s automate it.
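A sketch of what such a script could look like, pieced together from the numbered notes that follow. The *.js glob, the excluded node_modules directory, and the audit_deps wrapper name are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch based on the numbered notes; the file glob, the excluded
# directory, and the audit_deps name are assumptions.

# [1] Search the source tree for usages of one dependency.
grep_dep() {
  grep --include='*.js' --exclude-dir='node_modules' -R --color -n "$1" .
}

# [2] Export the function so the bash subshell started by xargs can see it.
export -f grep_dep

# [3] + [4] List the dependency names, then run grep_dep for each one,
# four at a time.
audit_deps() {
  jq -r '.dependencies | keys | .[]' package.json \
    | xargs -t -I {} -P 4 bash -c "grep_dep {}"
}
```

Run audit_deps from the directory that holds package.json. A dependency whose echoed grep command is followed by no matches is a candidate for removal.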

[1] Here’s how the grep flags work:

  • --include and --exclude-dir narrow the files that get searched

  • -R means recursive, tells it to grep all matching files

  • --color colorizes the output

  • -n displays line numbers

[2] I have to export the function so that a subshell can see it. xargs runs its command as a new process rather than inside the current shell, so a custom function is only reachable if it’s exported and called from a new bash subshell.

[3] -r is for “raw-output”, so no quotes around values, which makes it suitable for processing in other bash commands. We get the dependency names as an array (this is equivalent to Object.keys(require('./package.json').dependencies) in node.js)

[4] Then we pipe that to xargs, which handles setting up a grep for each line. Here’s how the xargs flags all work:

  • -t tells it to echo the constructed command; useful for debugging

  • -I {} defines the replacement string where the dependency string will get placed

  • -P 4 defines the concurrency, so 4 concurrent greps

  • we tell it to start a bash subshell where our grep_dep function is called with its args
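The effect of -r from note [3] is easy to see in isolation. In this sketch the package names are sample data only:

```shell
# Without -r, strings are printed as JSON, quotes included.
echo '{ "dependencies": { "lodash": "1.0.0", "react": "2.0.0" } }' \
  | jq '.dependencies | keys | .[]'
# prints:
# "lodash"
# "react"

# With -r, the quotes are dropped, so each line is a plain string.
echo '{ "dependencies": { "lodash": "1.0.0", "react": "2.0.0" } }' \
  | jq -r '.dependencies | keys | .[]'
# prints:
# lodash
# react
```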

There’s more that could be done to the grep-ing in that script to make it more robust, but that’s the basic gist.

I used something similar to this recently at work to prune unused dependencies. We have a huge front-end monolith with a single package.json that has 250 dependencies 🙀, so some automated assistance was necessary.

jq is awesome and makes working with JSON in bash easy. The DSL for filtering, querying, and creating JSON goes much deeper than what I’ve covered here, so see https://stedolan.github.io/jq/manual/ for the full documentation.