JSON Manipulation Using Linux jq Command

jq is a lightweight command-line JSON processor that helps you filter, parse, and manipulate JSON data, such as the responses you get when working with RESTful APIs.

jq is more than just a simple parser. It’s a powerful tool that transforms, reshapes, and queries JSON data.

This tutorial will dive deep into the capabilities of the jq command for JSON processing.

 

 

Installing jq

Let’s walk through the installation of jq on popular Linux distributions and macOS.

For Debian-based systems (like Ubuntu)

sudo apt update && sudo apt install -y jq

By running this command, you’re updating the package lists for upgrades and new package installations. Then, you install jq using the package manager.

For Red Hat-based distributions (like CentOS)

sudo yum install jq

This command uses the YUM package manager, the default on older Red Hat-based systems, to install jq. Newer releases use dnf, which accepts the same arguments.

For macOS (using Homebrew)

If you don’t have Homebrew installed, you can get it from https://brew.sh/.

brew install jq

By executing the above command, you install jq with ease.

 

Basic Syntax

jq operates as a filter, taking input and producing output based on the filter applied. Here’s a breakdown of the basic syntax:

echo '[JSON INPUT]' | jq '[FILTER]'

In this model, the JSON data (either from a file, an API response, or a direct echo command) is piped into jq, and the filter modifies or queries the JSON.
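Since the input can also come from a file, here is a minimal sketch of that variant. The temporary file and its contents are made up for illustration; the only jq behavior relied on is that it reads any file names given after the filter.

```shell
# Create a throwaway sample file (hypothetical data, for illustration only).
sample_file=$(mktemp)
printf '%s\n' '{"id": 101, "title": "System Admin"}' > "$sample_file"

# Pass the file name after the filter instead of piping through echo.
title=$(jq '.title' "$sample_file")
echo "$title"

# Clean up the temporary file.
rm -f "$sample_file"
```

Reading the file directly avoids an extra process and works the same way for large files.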

 

Reading JSON

Sometimes, you just need to visualize the entire JSON structure, especially when it’s compressed or in a single line.

echo '{"users":[{"name":"Alice","age":29},{"name":"Bob","age":34}]}' | jq .

Output:

{
  "users": [
    {
      "name": "Alice",
      "age": 29
    },
    {
      "name": "Bob",
      "age": 34
    }
  ]
}

Simply using the . filter will output the entire JSON, formatted for readability.

Get specific key

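To extract the value of a specific key, put the key name after the dot.

Consider the following JSON (in the commands throughout this tutorial, 'JSON' is a placeholder for the JSON document shown above the command):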

{
  "id": 101,
  "title": "System Admin"
}

Command:

echo 'JSON' | jq '.title'

Output:

"System Admin"

This extracts and displays the value associated with the title key.

Print nested key

Navigating nested JSON structures requires a path-like syntax.

Consider the following JSON:

{
  "employee": {
    "name": "Eve",
    "details": {
      "position": "Developer",
      "years_of_experience": 3
    }
  }
}

Command:

echo 'JSON' | jq '.employee.details.position'

Output:

"Developer"

By following the nested path, .employee.details.position, you can extract deeply nested keys with precision.

 

jq Filtration

One of jq’s strengths is its powerful filtration capabilities. Let’s explore some advanced filtering techniques:

Indexing

Arrays in JSON are zero-indexed. That means the first element is accessed with [0], the second with [1], and so on.

Consider the following JSON:

{
  "colors": ["red", "blue", "green", "yellow"]
}

Command:

echo 'JSON' | jq '.colors[1]'

Output:

"blue"

Slicing

Slicing allows you to extract a portion of an array, based on indices.

echo 'JSON' | jq '.colors[1:3]'

Output:

[
  "blue",
  "green"
]

The slice [1:3] grabs elements starting from index 1 and goes up to (but doesn’t include) index 3.
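Indexes and slices also accept negative numbers and open-ended ranges. The sketch below reuses the colors array from above; the shell variable names are only illustrative, and -c is used to keep each result on one line.

```shell
colors='{"colors": ["red", "blue", "green", "yellow"]}'

# A negative index counts from the end of the array.
last=$(echo "$colors" | jq -c '.colors[-1]')

# Omitting the start of a slice means "from the beginning".
first_two=$(echo "$colors" | jq -c '.colors[:2]')

# Omitting the end of a slice means "through the last element".
from_third=$(echo "$colors" | jq -c '.colors[2:]')

echo "$last"
echo "$first_two"
echo "$from_third"
```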

Array iteration

To apply a filter or transformation to each element in an array, use [].

echo 'JSON' | jq '.colors[]'

Output:

"red"
"blue"
"green"
"yellow"

This iterates over the array, printing each item on a new line.

Filter with multiple conditions

You can combine filters to make more specific queries.

Consider the following JSON:

{
  "employees": [
    {"name": "Adam", "role": "developer", "years": 2},
    {"name": "Beth", "role": "designer", "years": 5},
    {"name": "Carl", "role": "developer", "years": 4}
  ]
}

Command:

echo 'JSON' | jq '.employees[] | select(.role == "developer" and .years > 3)'

Output:

{
  "name": "Carl",
  "role": "developer",
  "years": 4
}

Here, you’re selecting developers with more than 3 years of experience.

Negative selection

You can exclude specific items by using the != operator inside select.

echo 'JSON' | jq '.colors[] | select(. != "green")'

Output:

"red"
"blue"
"yellow"

By using the negative selection, you’ve excluded “green” from the output.

 

Filtering Using select, map, and reduce

select: It chooses data based on a boolean condition.

Consider the following JSON:

{
  "students": [{"name": "Alex", "grade": 85}, {"name": "Brian", "grade": 75}, {"name": "Carla", "grade": 90}]
}

Command to select students with grades above 80:

echo 'JSON' | jq '.students[] | select(.grade > 80)'

Output:

{
  "name": "Alex",
  "grade": 85
}
{
  "name": "Carla",
  "grade": 90
}

Select Multiple Fields

You can select multiple fields at once using object construction: list the keys you want inside curly braces.

Consider the following JSON:

{
  "name": "Alice",
  "age": 30,
  "address": {
    "city": "New York",
    "state": "NY"
  },
  "email": "alice@email.com"
}

Command:

echo 'JSON' | jq '{name, email}'

Output:

{
  "name": "Alice",
  "email": "alice@email.com"
}

map: This function applies a filter to each item in an array.

Command to get an array of student names (using the students JSON from the select example):

echo 'JSON' | jq '.students | map(.name)'

Output:

[
  "Alex",
  "Brian",
  "Carla"
]
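Because map accepts any filter, select and map can be combined in one step. This is a small sketch using the same students data; the variable names are illustrative only.

```shell
students='{"students": [{"name": "Alex", "grade": 85}, {"name": "Brian", "grade": 75}, {"name": "Carla", "grade": 90}]}'

# Keep only students with a grade above 80, then reduce each remaining object to its name.
top_names=$(echo "$students" | jq -c '.students | map(select(.grade > 80) | .name)')

echo "$top_names"
```

Unlike .students[] | select(...), which emits separate results, map keeps everything inside a single array.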

reduce: Reduce transforms an array into a single value by iterating over its items.

Given an array of numbers:

{
  "values": [10, 20, 30, 40]
}

Command to sum the values:

echo 'JSON' | jq 'reduce .values[] as $item (0; . + $item)'

Output:

100
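For plain sums, the built-in add function is a shorter alternative to reduce. The sketch below uses the same values array and also derives an average from add and length.

```shell
values='{"values": [10, 20, 30, 40]}'

# add sums all elements of the input array.
total=$(echo "$values" | jq '.values | add')

# Dividing the sum by the array length gives the average.
average=$(echo "$values" | jq '.values | add / length')

echo "$total"
echo "$average"
```

reduce remains the better choice when the accumulator is something other than a simple sum, such as an object built up step by step.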

Chaining filters and combinations

Suppose you have the JSON:

{
  "projects": [
    {"name": "Project A", "tasks": ["task1", "task2"]},
    {"name": "Project B", "tasks": ["task3", "task4", "task5"]}
  ]
}

To get all tasks across all projects:

echo 'JSON' | jq '.projects[].tasks[]'

Output:

"task1"
"task2"
"task3"
"task4"
"task5"

By combining different filters and operators, you can extract, transform, and manipulate JSON data.

 

Sort By Key

You can use the sort_by function, together with to_entries and from_entries, to sort an object by its keys.

If you have JSON data like this:

{
  "zeta": "first value",
  "alpha": "second value",
  "beta": "third value"
}

To sort the keys at the top level:

echo 'JSON' | jq 'to_entries | sort_by(.key) | from_entries'

Output:

{
  "alpha": "second value",
  "beta": "third value",
  "zeta": "first value"
}
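Two related variations may be useful here: the -S (--sort-keys) option sorts object keys without any filter, and sort_by can also order an array of objects by one of their fields. The sample data below is illustrative.

```shell
doc='{"zeta": "first value", "alpha": "second value", "beta": "third value"}'

# -S sorts the keys of every object in the output; -c keeps it on one line.
sorted_keys=$(echo "$doc" | jq -S -c '.')

students='[{"name": "Alex", "grade": 85}, {"name": "Brian", "grade": 75}, {"name": "Carla", "grade": 90}]'

# sort_by orders the array by the given field, lowest first.
by_grade=$(echo "$students" | jq -c 'sort_by(.grade) | map(.name)')

echo "$sorted_keys"
echo "$by_grade"
```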

 

jq Paths

In jq, paths provide a way to navigate through JSON structures, allowing you to locate and manipulate specific parts of the data.

Output paths

Instead of outputting values, sometimes you want to output their paths within the JSON structure.

This can be helpful when diagnosing or exploring large JSON documents.

Consider the following JSON:

{
  "departments": {
    "IT": ["Alice", "Bob"],
    "HR": ["Carol", "Dave"]
  }
}

Command to output paths (the -c option prints each path on a single line):

echo 'JSON' | jq -c 'path(..)'

Output:

[]
["departments"]
["departments","IT"]
["departments","IT",0]
["departments","IT",1]
["departments","HR"]
["departments","HR",0]
["departments","HR",1]

Here, each array represents a path in the JSON structure. The .. syntax is a recursive descent, which means it outputs values at all levels in the JSON.

Select using paths

You can use paths to select specific parts of the JSON.

Using the same JSON as above, suppose you want to access Bob from the IT department.

echo 'JSON' | jq '.departments.IT[1]'

Output:

"Bob"

But paths become even more powerful when combined with dynamic operations.

Command using a stored path:

echo 'JSON' | jq 'getpath(["departments", "IT", 1])'

Output:

"Bob"

Here, getpath fetches the value at the specified path.
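As a further sketch on the same departments data, paths(scalars) lists only the paths that lead to leaf values, and setpath writes a value at a given path. The replacement name "Robert" is made up for illustration.

```shell
org='{"departments": {"IT": ["Alice", "Bob"], "HR": ["Carol", "Dave"]}}'

# List only the paths that end in a scalar (string, number, boolean).
leaf_paths=$(echo "$org" | jq -c 'paths(scalars)')

# setpath is the write counterpart of getpath: replace the value at a path.
it_after=$(echo "$org" | jq -c 'setpath(["departments", "IT", 1]; "Robert") | .departments.IT')

echo "$leaf_paths"
echo "$it_after"
```

Note that paths(scalars) skips leaves whose value is null or false, because the filter result is used as a condition.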

 

Modify values

You can use jq to modify values.

Here’s how you can update, replace, or add new values to your JSON structures:

Update existing values

For this example, consider the following JSON:

{
  "team": {
    "leader": "Alex",
    "members": ["Ben", "Chris", "Diana"]
  }
}

If you need to change the team leader’s name from Alex to Alice:

echo 'JSON' | jq '.team.leader = "Alice"'

Output:

{
  "team": {
    "leader": "Alice",
    "members": [
      "Ben",
      "Chris",
      "Diana"
    ]
  }
}

Add new key-value pairs

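To add a new field, assign a value to a path that doesn’t exist yet. Consider the following JSON:

{
  "user": {
    "name": "Eve",
    "score": 100
  }
}

To add an email field:

echo 'JSON' | jq '.user.email = "eve@example.com"'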

Output:

{
  "user": {
    "name": "Eve",
    "score": 100,
    "email": "eve@example.com"
  }
}

Deleting fields

Continuing with the output above, you can use del to remove the score field from the user’s profile:

echo 'JSON' | jq 'del(.user.score)'

Output:

{
  "user": {
    "name": "Eve",
    "email": "eve@example.com"
  }
}

 

Arithmetic Operations

jq can perform arithmetic operations on numeric values within JSON structures.

For the following JSON:

{
  "dimensions": {
    "length": 5,
    "width": 3
  }
}

To calculate the area (length x width):

echo 'JSON' | jq '.dimensions.length * .dimensions.width'

Output:

15

Increment and decrement

Consider the following JSON:

{
  "user": {
    "name": "Eve",
    "score": 50
  }
}

To increment the score by 5:

echo 'JSON' | jq '.user.score += 5'

Output:

{
  "user": {
    "name": "Eve",
    "score": 55
  }
}

Another example for percentage calculations:

{
  "product": {
    "name": "Widget",
    "price": 100,
    "discount": 10
  }
}

To calculate the price after discount:

echo 'JSON' | jq '.product.price * (1 - (.product.discount / 100))'

Output:

90

Complex expressions

For a JSON like this:

{
  "circle": {
    "radius": 7
  }
}

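To compute the area of the circle (πr^2), use the pow function, since jq has no ^ operator:

echo 'JSON' | jq 'pow(.circle.radius; 2) * 3.14159265'

Output (floating-point rounding may add trailing digits):

153.93803985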

 

String Manipulations

With string manipulations, you can modify, combine, and even interrogate string data.

Replace

You can use the gsub function to globally replace all occurrences of a substring.

If you have a JSON like:

{
  "message": "Hello, World!"
}

And you want to replace “World” with “jq”.

echo 'JSON' | jq '.message |= gsub("World"; "jq")'

Output:

{
  "message": "Hello, jq!"
}

Match and Capture

The match function lets you use regular expressions.

Consider a string containing a date:

{
  "event": "The concert is on 2023-09-30."
}

To capture the date:

echo 'JSON' | jq '.event | match("\\d{4}-\\d{2}-\\d{2}").string'

Output:

"2023-09-30"

Here, the regex captures a date in the format YYYY-MM-DD.
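When the individual parts of a match are needed, the capture function with named groups may be more convenient than match, and test gives a plain true/false answer. This is a sketch on the same event string; the group names year, month, and day are chosen here for illustration.

```shell
event='{"event": "The concert is on 2023-09-30."}'

# capture returns an object keyed by the named groups in the regex.
parts=$(echo "$event" | jq -c '.event | capture("(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})")')

# test only reports whether the regex matches.
has_date=$(echo "$event" | jq '.event | test("\\d{4}-\\d{2}-\\d{2}")')

echo "$parts"
echo "$has_date"
```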

String concatenation

Given the JSON:

{
  "person": {
    "firstName": "John",
    "lastName": "Doe"
  }
}

To concatenate the first name and last name:

echo 'JSON' | jq '.person | "\(.firstName) \(.lastName)"'

Output:

"John Doe"

In the above example, we used the string interpolation syntax \(expression) inside the double-quoted string for concatenation.

 

Testing and Comparisons

jq allows you to perform various tests on JSON data.

Given a JSON:

{
  "product": {
    "price": 50,
    "isAvailable": true
  }
}

To check if the product is available:

echo 'JSON' | jq '.product.isAvailable == true'

Output:

true

If you’re working with numeric comparisons, you can do things like:

echo 'JSON' | jq '.product.price > 40'

Output:

true

This command checks if the price is greater than 40.

Check if Key Exists

Sometimes you need to verify the presence of a key or value.

For a JSON:

{
  "employee": {
    "name": "Alice",
    "designation": "Developer"
  }
}

You can use the has function to check for existence:

echo 'JSON' | jq 'has("designation")'

Output:

false

The result is false because the has function checks at the root level of the JSON.

To check within the “employee” object:

echo 'JSON' | jq '.employee | has("designation")'

Output:

true

Similarly, to check for the existence of a specific value, you can use the contains function.

For example, to see if the “employee” object contains the value “Developer”:

echo 'JSON' | jq '.employee | contains({"designation": "Developer"})'

Output:

true

 

Conditional Operations

jq provides the if-else construct that’s similar to many programming languages.

Basic If-Else

Given a JSON containing a score:

{
  "student": {
    "name": "Mark",
    "score": 85
  }
}

To classify the student based on the score:

echo 'JSON' | jq '.student | if .score >= 90 then "Excellent" elif .score >= 75 then "Good" else "Average" end'

Output:

"Good"

Here, since Mark’s score is 85, he falls into the “Good” category.

Using the same student data, you might want to add a new field called classification based on the score.

echo 'JSON' | jq '.student.classification = (if .student.score >= 90 then "Excellent" elif .student.score >= 75 then "Good" else "Average" end)'

Output:

{
  "student": {
    "name": "Mark",
    "score": 85,
    "classification": "Good"
  }
}

Nested If-Else

For more complex decision-making, you can nest if-else statements.

Consider the JSON:

{
  "employee": {
    "position": "Manager",
    "yearsOfExperience": 3
  }
}

To determine a bonus based on position and experience:

echo 'JSON' | jq '.employee | if .position == "Manager" then (if .yearsOfExperience > 5 then "5000" elif .yearsOfExperience > 2 then "3000" else "1000" end) else "0" end'

Output:

"3000"

Since the employee is a Manager with 3 years of experience, the bonus is determined as 3000.

 

Working with Numbers

jq offers several built-in functions to handle numbers, making it a versatile tool for mathematical tasks on JSON data.

Rounding

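You can use the round function to round a number to the nearest integer.

For a JSON:

{
  "measurement": 12.5678
}

To round to the nearest integer:

echo 'JSON' | jq '.measurement | round'

Output:

13

To round to two decimal places, scale the number before rounding and divide afterwards:

echo 'JSON' | jq '.measurement * 100 | round / 100'

Output:

12.57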

Minimum and Maximum

You can use the max and min functions to find the largest and smallest numbers in an array. The related max_by and min_by functions take a filter and are meant for arrays of objects.

{
  "temperatures": [72.5, 68.9, 75.2, 69.8]
}

To find the maximum temperature:

echo 'JSON' | jq '.temperatures | max'

Output:

75.2

Similarly, for the minimum:

echo 'JSON' | jq '.temperatures | min'

Output:

68.9
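For arrays of objects, max_by and min_by take a filter that picks the value to compare and return the whole matching object. The sketch below reuses the students data from earlier in this tutorial.

```shell
students='{"students": [{"name": "Alex", "grade": 85}, {"name": "Brian", "grade": 75}, {"name": "Carla", "grade": 90}]}'

# max_by(.grade) returns the object with the highest grade; .name then picks one field.
best=$(echo "$students" | jq -r '.students | max_by(.grade) | .name')

# min_by(.grade) does the same for the lowest grade.
lowest=$(echo "$students" | jq -r '.students | min_by(.grade) | .name')

echo "$best"
echo "$lowest"
```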

Math Functions

jq offers various mathematical functions that can be applied directly to numerical values.

Using the same measurement value:

Square Root

echo 'JSON' | jq '.measurement | sqrt'

Output:

3.545109 (truncated to six decimal places)

Floor (rounding down)

echo 'JSON' | jq '.measurement | floor'

Output:

12

Ceil (rounding up)

echo 'JSON' | jq '.measurement | ceil'

Output:

13

 

Date and Time Formatting

jq has a function called strftime for date and time formatting. It formats times in UTC; the related strflocaltime function uses the local time zone instead.

Given a UNIX timestamp in a JSON:

{
  "event": {
    "timestamp": 1695993600
  }
}

You can format this timestamp to a readable date:

echo 'JSON' | jq '.event.timestamp | strftime("%Y-%m-%d")'

Output:

"2023-09-29"

In the above command, %Y-%m-%d is a format string where %Y stands for the full year, %m for the month, and %d for the day of the month.

For more detail, let’s format the timestamp to show the date, time, and even the day of the week.

echo 'JSON' | jq '.event.timestamp | strftime("%A, %d %B %Y %H:%M:%S")'

Output:

"Friday, 29 September 2023 13:20:00"

In this format string:

  • %A stands for the full weekday name.
  • %d is the day of the month.
  • %B is the full month name.
  • %Y is the full year.
  • %H, %M, and %S represent hours, minutes, and seconds respectively.

Converting a Formatted Date to a Timestamp

Sometimes you might need to perform the inverse operation, converting a readable date string back into a timestamp.

{
  "date": "2023-09-01"
}

Command:

echo 'JSON' | jq '.date | strptime("%Y-%m-%d") | mktime'

Output:

1693526400

Here, strptime parses the date string based on the given format, and mktime converts it back to a UNIX timestamp.
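For ISO 8601 timestamps specifically, jq also ships the shortcuts todate and fromdate, which may save writing a format string. A minimal round-trip sketch, using -n so that no input document is needed:

```shell
# todate converts seconds since the epoch to an ISO 8601 string in UTC.
iso=$(jq -n -r '1695993600 | todate')

# fromdate parses an ISO 8601 string back into seconds since the epoch.
epoch=$(jq -n '"2023-09-29T13:20:00Z" | fromdate')

echo "$iso"
echo "$epoch"
```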

 

Flatten a Structure

The jq command provides the flatten function for flattening nested structures.

Consider the following JSON structure with nested arrays:

{
  "groups": [
    ["Alice", "Bob"],
    ["Charlie", "David"],
    ["Eve", "Frank"]
  ]
}

To flatten the nested arrays into a single array:

echo 'JSON' | jq '.groups | flatten'

Output:

[
  "Alice",
  "Bob",
  "Charlie",
  "David",
  "Eve",
  "Frank"
]

With the use of the flatten function, the inner arrays are combined into a single array, removing the nested structure.
Even if the nesting is multiple levels deep, the flatten function will reduce it to a single array level.

Nested Iterations

Let’s use a JSON structure that contains nested arrays:

{
  "departments": [
    {
      "name": "IT",
      "employees": ["Alice", "Bob"]
    },
    {
      "name": "HR",
      "employees": ["Charlie", "David"]
    }
  ]
}

Now, suppose you want to list each employee with their department:

echo 'JSON' | jq '.departments[] | .name as $dept | .employees[] | "\($dept): \(.)"'

Output:

"IT: Alice"
"IT: Bob"
"HR: Charlie"
"HR: David"

Here’s a breakdown:

  • .departments[] iterates over each department.
  • .name as $dept stores the department name in the $dept variable.
  • .employees[] iterates over the nested employee arrays.
  • The final "\($dept): \(.)" produces the desired output by combining the department and employee names.

Iterating over Nested Objects

Consider a structure with nested objects:

{
  "school": {
    "grade1": {
      "sectionA": ["Alice", "Bob"],
      "sectionB": ["Charlie"]
    },
    "grade2": {
      "sectionA": ["David", "Eve"]
    }
  }
}

To list each student with their grade and section:

echo 'JSON' | jq '.school | to_entries[] | .key as $grade | .value | to_entries[] | .key as $section | .value[] | "\($grade)-\($section): \(.)"'

Output:

"grade1-sectionA: Alice"
"grade1-sectionA: Bob"
"grade1-sectionB: Charlie"
"grade2-sectionA: David"
"grade2-sectionA: Eve"

Explanation:

  • to_entries[] converts each object to an array of key-value pairs.
  • .key as $grade and .key as $section store the grade and section names in variables.
  • Further nested iterations continue using the .value[] construct.
  • The final output is combined using string interpolation.

 

Join Two Arrays

Consider two simple arrays within a JSON structure:

{
  "fruits": ["apple", "banana", "cherry"],
  "vegetables": ["carrot", "broccoli"]
}

If you want to concatenate these two arrays into a single array:

echo 'JSON' | jq '.fruits + .vegetables'

Output:

[
  "apple",
  "banana",
  "cherry",
  "carrot",
  "broccoli"
]

Here, the + operator is used to concatenate the two arrays together.

Combining Elements from Two Arrays

Let’s use another example with two arrays of equal length:

{
  "names": ["Alice", "Bob", "Charlie"],
  "scores": [85, 90, 78]
}

Suppose you want to combine each name with its corresponding score:

echo 'JSON' | jq '. as $data | range(0; $data.names | length) | "\($data.names[.]) scored \($data.scores[.])"'

Output:

"Alice scored 85"
"Bob scored 90"
"Charlie scored 78"

Here’s a breakdown:

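  • . as $data keeps the whole document in a variable, because after range the current value (.) is the index, not the document.
  • range(0; $data.names | length) generates the indices from 0 up to, but not including, the length of the names array.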
  • For each index, the expression accesses corresponding elements from both names and scores arrays.
  • The result is a string representation combining the name and score.
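An alternative sketch for pairing up two arrays uses transpose, which turns a pair of arrays into an array of pairs. The object keys name and score are chosen here for illustration.

```shell
data='{"names": ["Alice", "Bob", "Charlie"], "scores": [85, 90, 78]}'

# transpose turns [[names...], [scores...]] into [[name, score], ...];
# map then converts each pair into an object.
pairs=$(echo "$data" | jq -c '[.names, .scores] | transpose | map({name: .[0], score: .[1]})')

echo "$pairs"
```

This avoids index arithmetic altogether; if the arrays differ in length, transpose pads the shorter one with null.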

 

Check Types

The type function returns the type of the data.

Given a JSON:

{
  "name": "Alice",
  "age": 25,
  "is_student": false
}

To verify the type of age:

echo 'JSON' | jq '.age | type'

Output:

"number"

 

Casting Types

Sometimes, data is not in the required type, and you’d need to convert or cast it.

String to Number

If you have a JSON:

{
  "value": "123"
}

To convert value to a number:

echo 'JSON' | jq '.value | tonumber'

Output:

123

Number to String

{
  "age": 25
}

Convert age to a string:

echo 'JSON' | jq '.age | tostring'

Output:

"25"

Boolean to String

{
  "is_active": true
}

Convert is_active to a string:

echo 'JSON' | jq '.is_active | tostring'

Output:

"true"

 

Accessing Environment Variables

To pass shell or environment variables to jq, you can use the --arg option. The shell expands the variable, and --arg binds the resulting string to a jq variable, which can then be used in your jq expression.

For instance, let’s say you have a variable named USERNAME, and you want to filter a JSON object based on this variable’s value.

USERNAME="Alice"
echo '{"users": ["Alice", "Bob", "Charlie"]}' | jq --arg user "$USERNAME" '.users[] | select(. == $user)'

Output:

"Alice"

Here’s what happens:

  • --arg user "$USERNAME" passes the value of USERNAME to the jq variable $user. The double quotes keep a value that contains spaces together as a single argument.
  • The select function filters the array, looking for a match with $user.

Setting Default Values

In some cases, you might want to set a default value if an environment variable is not set:

USERNAME=${USERNAME:-"DefaultUser"}
echo '{"users": ["DefaultUser", "Alice"]}' | jq --arg user "$USERNAME" '.users[] | select(. == $user)'

If USERNAME isn’t set or is empty, “DefaultUser” is used instead.
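Two related mechanisms are worth a short sketch: exported environment variables can be read inside a filter through env, and --argjson passes a value as JSON (for example a number) rather than as a string. The variable names APP_USER and min_age are hypothetical.

```shell
# Exported variables are visible to jq through the env object.
export APP_USER="Alice"
matched=$(echo '{"users": ["Alice", "Bob", "Charlie"]}' | jq -r '.users[] | select(. == env.APP_USER)')

# --argjson parses its value as JSON, so $min is a number here, not a string.
min_age=25
adults=$(echo '{"users": [{"name": "Alice", "age": 28}, {"name": "Bob", "age": 22}]}' \
  | jq -r --argjson min "$min_age" '.users[] | select(.age >= $min) | .name')

echo "$matched"
echo "$adults"
```

With --arg the same comparison would compare a number against the string "25", which is why --argjson is used for the numeric case.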

 

Creating Custom jq Functions

Here’s the basic structure of a custom function in jq:

def function_name(parameters):
  function_body;

Let’s create a function that multiplies a number by 2:

echo '5' | jq 'def double(x): x * 2; double(.)'

Output:

10

In this example:

  • The custom function double(x) multiplies the input x by 2.
  • We then use this function to double the number 5.
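Functions can also be written without parameters, in which case they operate on their input (.), or with $-prefixed parameters that are bound as values. A small sketch with two hypothetical helpers, double and scale:

```shell
# A parameterless function works on its input, so it composes with map.
doubled=$(echo '[1, 2, 3]' | jq -c 'def double: . * 2; map(double)')

# A $-prefixed parameter is evaluated once and bound as a value.
scaled=$(echo '[1, 2, 3]' | jq -c 'def scale($factor): . * $factor; map(scale(10))')

echo "$doubled"
echo "$scaled"
```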

Advanced Example with Filter

Suppose you want to filter out users based on their age from a JSON. Let’s create a custom function to do this:

Given the JSON:

{
  "users": [
    {"name": "Alice", "age": 28},
    {"name": "Bob", "age": 22},
    {"name": "Charlie", "age": 30}
  ]
}

Command:

echo 'JSON' | jq 'def isAdult(user): user.age >= 18; .users[] | select(isAdult(.))'

Output:

{
  "name": "Alice",
  "age": 28
}
{
  "name": "Bob",
  "age": 22
}
{
  "name": "Charlie",
  "age": 30
}

In this example:

  • The custom function isAdult(user) checks if a user’s age is 18 or more.
  • The select function then filters users based on the custom function. All three users are at least 18, so all of them appear in the output.

Recursive Functions

You can also create recursive functions in jq. Let’s create a factorial function as an example:

echo '5' | jq 'def factorial(n): if n == 0 then 1 else n * factorial(n-1) end; factorial(.)'

Output:

120

The factorial(n) function calculates the factorial of a number using recursion.

 

Produce Compact Output

By default, jq pretty-prints its output, making it more readable.

However, if you want the output to be compact without any extra spaces or newlines, you can use the -c or --compact-output option.

Given a JSON:

{
  "users": [
    {"name": "Alice", "age": 28},
    {"name": "Bob", "age": 22}
  ]
}

Command:

echo 'JSON' | jq -c '.'

Output:

{"users":[{"name":"Alice","age":28},{"name":"Bob","age":22}]}

 

Remove Quotes

By default, jq prints strings as JSON, wrapped in double quotes. You can use the -r or --raw-output option to get the raw value without quotes.

For instance, extracting the name of the first user from the JSON above:

echo 'JSON' | jq -r '.users[0].name'

Output:

Alice
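Raw output pairs well with jq's format strings when the result is meant for other tools. As a sketch, @csv turns an array into one CSV row; the users data is the same as above.

```shell
users='{"users": [{"name": "Alice", "age": 28}, {"name": "Bob", "age": 22}]}'

# Build one array per user, format it as a CSV row, and print it without JSON quoting.
csv=$(echo "$users" | jq -r '.users[] | [.name, .age] | @csv')

echo "$csv"
```

Without -r, each row would itself be printed as a JSON string with escaped quotes.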

 

Handle Missing/Null Values

In some scenarios, you want to retrieve a key’s value that is not present in your data.

The // operator lets you provide a default value if the key is absent or null.

Given the JSON:

{
  "name": "Alice",
  "age": null
}

Command:

echo 'JSON' | jq '.age // "Not provided"'

Output:

"Not provided"

Because the value of age is null, “Not provided” is the output. The same happens when the key is absent or its value is false.
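One detail to keep in mind is that // treats false the same way as null. The sketch below, on a made-up profile object, shows the difference and one way to keep a legitimate false value.

```shell
profile='{"name": "Alice", "active": false}'

# The key is absent, so the default is used.
email=$(echo "$profile" | jq '.email // "none"')

# The value is false, which // also replaces with the default.
active_default=$(echo "$profile" | jq '.active // "unknown"')

# An explicit null check keeps false intact.
active_kept=$(echo "$profile" | jq 'if .active == null then "unknown" else .active end')

echo "$email"
echo "$active_default"
echo "$active_kept"
```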

 

Error Handling

jq has a try-catch block for error handling.

Let’s say you want to convert a string to a number, but you’re not certain if all values are convertible:

Given the JSON:

{
  "value": "123a"
}

Command:

echo 'JSON' | jq 'try (.value | tonumber) catch "Invalid number"'

Output:

"Invalid number"

The try block attempts to convert the value to a number. If it encounters an error, the catch block is executed, providing a custom error message.

Here’s a more complex example, handling both missing keys and value conversion:

echo 'JSON' | jq 'try (.nonexistent | tonumber) catch "Error: \(.)"'

Output:

"Error: null (null) cannot be parsed as a number"

The error message in the catch block can be customized using the . operator, which in this context refers to the error string.
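When the error message itself isn't needed, the ? suffix is a shorthand for try without a catch: failing values are silently dropped. A sketch on a made-up list of strings, also combining it with // to substitute a marker:

```shell
mixed='{"values": ["1", "x", "3"]}'

# tonumber? suppresses the error for "x", so only convertible values remain.
numbers=$(echo "$mixed" | jq -c '[.values[] | tonumber?]')

# Combined with //, a failed conversion can be replaced by a default instead.
labelled=$(echo "$mixed" | jq -c '[.values[] | (tonumber? // "invalid")]')

echo "$numbers"
echo "$labelled"
```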

 

How jq Saved the Day

One day, a department head approached me with an urgent request. The marketing department had recently initiated a promotional campaign, and they needed a report detailing the usage statistics of certain services by users.

They handed over a gigantic log file containing millions of entries, each in JSON format.

Traditional tools like grep or awk were not a good fit for the job. That’s when I remembered jq.
I used jq to extract the relevant data by applying filters, reducing the file to a much more manageable size. The actual command looked something like this:

cat large_log_file.json | jq 'select(.feature == "desired_feature" and .location == "Egypt")' > filtered_data.json

With a single jq command, I filtered relevant data in just 30 minutes, achieving a 96% efficiency increase over traditional methods.

 

Resource

https://jqlang.github.io/jq/manual/
