JSON Manipulation and Conversion Techniques in Python

In this tutorial, you’ll learn various JSON processing techniques, such as loading, writing, sorting, and parsing JSON.

JSON stands for JavaScript Object Notation, a text format that represents structured data. It is widely used to exchange information between programs.

In Python, we can use JSON by importing the built-in Python module called json. The json module encodes and decodes JSON data.

 

 

Why use JSON?

JSON contains data that can be read by humans and by machines. The main purpose of using JSON in Python is to store & retrieve lists, tuples, and dictionaries.

Most of the APIs use JSON format to pass information. Similarly, if you have a large set of data, you can encode the data in a JSON format and store it in a database.

The syntax to load this package is as follows:

Syntax:

import json

 

Read JSON file

To read data from a JSON file, we can use the load() method.

Reading JSON data in Python means converting JSON objects to Python objects. The conversion of JSON objects to Python objects is called deserialization. For instance, a JSON array is equivalent to a list in Python.
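To make the type mapping concrete, here is a minimal sketch that parses a small JSON string and records which Python type each value becomes. The string 'sample' and its keys are made up for illustration.

```python
import json

# Hypothetical JSON text covering the basic JSON value types
sample = '{"text": "hi", "count": 3, "ratio": 0.5, "items": [1, 2], "nested": {"a": null}, "flag": true}'
parsed = json.loads(sample)

# Record the Python type that each JSON value was converted to
type_names = {key: type(value).__name__ for key, value in parsed.items()}
print(type_names)
```

A JSON array becomes a list, a JSON object becomes a dict, true/false become bool, and null becomes None.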

The syntax for the load() is given below:

Syntax:

data = json.load(object)
  • ‘object’ is an open file object (or any other file-like object) that contains JSON text. When the statement is executed, its contents are parsed and stored in the variable ‘data’ as a Python object.

Consider the following JSON object:

Code:

{
	"date": "2021-07-17",
	"firstname": "Hamza",
	"lastname": "Sher",
	"city": "Kyoto",
	"array": [
	    "Carmela",
		"Ashlee",
		"Alisha"
	],
	"array of objects": [
		{
			"index": 0,
			"index start at 5": 5
		},
		{
			"index": 1,
			"index start at 5": 6
		},
		{
			"index": 2,
			"index start at 5": 7
		}
	]
}

Assuming this object is saved in a file named jsonData.json, the following code loads the file and prints its entire contents:

Code:

import json
jsonFile = open('jsonData.json')
data = json.load(jsonFile)
print(data)
jsonFile.close()

Output:

Load JSON data using the load() method

If we have a string that is storing the JSON object, we can use the loads() method to read that string.

Syntax:

data = json.loads(jsonString)

The following code parses a JSON string and prints the resulting Python dictionary:

Code:

import json
jsonData = '{"Name": "Hamza", "ID":"12345"}'
data = json.loads(jsonData)
print(data)

Output:

Load JSON data using loads() method

 

Get JSON value

JSON objects are constructed in key-value pairs which makes getting a particular value from the object very simple. We can use dictionary indexing to access the value associated with the key.

Syntax:

data['firstname']

The following code demonstrates how we can use it to get our desired results.

Code:

import json
jsonFile = open('jsonData.json')
data = json.load(jsonFile)
print(data['firstname'])
jsonFile.close()

Output:

Get a value by specifying a key from a json object
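The same indexing works for nested values. The sketch below uses an inline copy of part of the sample object instead of reading jsonData.json, and the key "country" is a made-up example of a key that does not exist.

```python
import json

# Inline copy of part of the sample object shown earlier in the tutorial
json_text = '''
{
    "firstname": "Hamza",
    "array": ["Carmela", "Ashlee", "Alisha"],
    "array of objects": [
        {"index": 0, "index start at 5": 5},
        {"index": 1, "index start at 5": 6}
    ]
}
'''
data = json.loads(json_text)

# Index into a list stored under a key
first_name_in_array = data["array"][0]
# Chain indexes to reach a value inside a nested object
second_offset = data["array of objects"][1]["index start at 5"]
# get() returns a default instead of raising KeyError for a missing key
country = data.get("country", "unknown")
print(first_name_in_array, second_offset, country)
```

Using get() is a reasonable choice when a key is optional, while plain indexing is better when a missing key should be treated as an error.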

 

Update & Delete JSON object

Once a JSON object has been loaded into Python, it is an ordinary dictionary, so we can update it with the dictionary’s built-in update() method. The json package itself has no update() function.

The update() method adds new key-value pairs to the dictionary, or overwrites the values of keys that already exist. We can pass a single key-value pair or a whole dictionary to be merged into the existing data.

Syntax:

jsonObject.update(KeyValuePair)

The following code implements the update() method.

Code:

import json
jsonData = '{"ID":"123", "Name": "Hamza"}'
data = json.loads(jsonData)
newData = {"DOB": "22-10-2001"}
data.update(newData)
print(data)

Output:

Update a JSON object using the dictionary update() method

The contents of the dictionary ‘newData’ have been added to the ‘data’ dictionary that was parsed from ‘jsonData’. This is how the update() method performs its functionality.

Moving on to the delete functionality: there is no function in the json package to delete a key-value pair. Instead, we remove the key from the loaded dictionary with Python’s del statement, which takes a little more code.

Here is how we can implement deletion on a JSON object. Remember we are using the same JSON file that we have been using and have mentioned at the start of this tutorial.

Code:

import json
file = open('jsonData.json', 'r')
data = json.load(file)
file.close()
if 'firstname' in data:
    del data['firstname']
print(data)

Output:

Delete a key value pair from a JSON object

Let us take a look at what’s really happening here. When we put a check to see if ‘firstname’ exists in the dictionary, Python checks the dictionary and if the key exists, we can use the del keyword to delete that key-value pair.
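As an alternative to the check followed by del, the dictionary pop() method with a default value removes a key in one step and does not fail when the key is missing. This is a minimal sketch on a small made-up dictionary, not the tutorial's jsonData.json file.

```python
# Hypothetical record standing in for data loaded from a JSON file
record = {"firstname": "Hamza", "city": "Kyoto"}

# pop() removes the key and returns its value
removed = record.pop("firstname", None)
# With a default, a missing key returns the default instead of raising KeyError
missing = record.pop("lastname", None)
print(record, removed, missing)
```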

 

Update JSON Value by Key

Here’s how you can update a value by key:

json_object = {
    "name": "John",
    "age": 30,
    "city": "New York"
}
json_object["age"] = 35
print(json_object)

Output:

{'name': 'John', 'age': 35, 'city': 'New York'}

The value associated with the key "age" is updated from 30 to 35.

 

Rename JSON Keys

A common way to rename a key in Python is to pop the old key and assign its value to the new one:

json_object = {
    "firstname": "John",
    "age": 30
}
json_object["name"] = json_object.pop("firstname")
print(json_object)

Output:

{'age': 30, 'name': 'John'}

The code uses the pop method to remove the key "firstname" and simultaneously retrieve its value. Then, it assigns that value to a new key "name".

The result is that the key "firstname" is effectively renamed to "name".

 

Remove Duplicates

You can use Python’s built-in set type to remove duplicate objects from a JSON array.

Code:

import json
json_array = [
    {"name": "John", "age": 30},
    {"name": "Jane", "age": 25},
    {"name": "John", "age": 30}
]

# Convert JSON objects to strings to make them hashable
json_strings = [json.dumps(item, sort_keys=True) for item in json_array]

# Use set to remove duplicates
unique_json_strings = set(json_strings)

# Convert back to JSON objects
unique_json_array = [json.loads(item) for item in unique_json_strings]
print(unique_json_array)

Output:

[{'age': 25, 'name': 'Jane'}, {'age': 30, 'name': 'John'}]

Because sets are unordered, the two objects may be printed in either order.

First, we convert each JSON object in the array to a string using json.dumps, making sure to sort the keys to ensure consistent ordering.

Then, we use a set to remove duplicates, as sets cannot contain duplicate elements.

Finally, we convert the unique JSON strings back to JSON objects using json.loads.
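If the original order of the array matters, one possible variation is to replace the set with dict.fromkeys(), which also drops duplicates but keeps the first occurrence of each item in order. This is a sketch of that swapped-in technique, using the same sample records.

```python
import json

records = [
    {"name": "John", "age": 30},
    {"name": "Jane", "age": 25},
    {"name": "John", "age": 30}
]

# Serialize with sorted keys so equal objects produce identical strings
keys = [json.dumps(item, sort_keys=True) for item in records]

# dict.fromkeys() keeps only the first occurrence of each string, in order
unique_records = [json.loads(key) for key in dict.fromkeys(keys)]
print(unique_records)
```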

 

Sort JSON

We can sort a JSON object alphabetically based on the keys. To do this, we use the json.dumps() method along with a few arguments to the method. The syntax to use this method is as follows:

Syntax:

json.dumps(data, sort_keys=True)

Here we pass two arguments to the function json.dumps(). The first one ‘data’ contains the JSON object that we stored in a Python variable.

The second is the sort_keys argument, which, when set to True, sorts the keys alphabetically. The json.dumps() method then returns the sorted JSON object as a string. The following code uses this functionality:

Code:

import json
file = open('jsonData.json', 'r')
data = json.load(file)
file.close()
print(json.dumps(data, sort_keys=True))

Output:

Sort a JSON File using dumps() and sort_keys=True argument

Looking at the code, it’s fairly easy to understand what is going on. First, we are loading the data and storing it into the variable ‘data’, and closing the file afterward.

Then in a single statement, we print the sorted data with the help of the function json.dumps() and the sort_keys=True argument.
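Note that sort_keys=True also sorts the keys of nested objects, not only the top level. A small sketch with a made-up dictionary shows this:

```python
import json

# Hypothetical data with keys deliberately out of order at both levels
data = {"b": 1, "a": {"d": 2, "c": 3}}

sorted_text = json.dumps(data, sort_keys=True)
print(sorted_text)  # {"a": {"c": 3, "d": 2}, "b": 1}
```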

 

Create JSON objects

To create a JSON object, we start with a Python dictionary that contains our data. We will use the same methods as we used before, i.e., json.dumps() and json.loads(). The following code implements this functionality:

Code:

import json
data = {"Name":"John Doe", "ID":"123"}
json_dump = json.dumps(data)
json_data = json.loads(json_dump)
print(json_data)

Output:

Create a JSON object

Here we define some data as a Python dictionary. Then we use the json.dumps() method and pass the Python dictionary as an argument.

This converts our Python dictionary into a JSON string, which is the actual JSON representation of the data. The json.loads() method then parses this string back into a Python dictionary, and that dictionary is what we see when it is printed.

 

Write JSON to file

To write a JSON object into a JSON file, we can use the json.dump() method. This method takes the data that we will write to the file and also the file that we will write the data into. The following code explains how we can do just that!

Code:

import json
file = open('jsonData.json', 'r')
data = json.load(file)
file.close()
newData = {"DOB": "22-10-2001"}
data.update(newData)
file = open('jsonData.json', 'w')
json.dump(data, file)
file.close()
print(data)

Output:

Write to json file using dump()

First, we open the file in read mode and store the contents of the file into the variable ‘data’. Then we update the ‘data’ and add the new key-value pair into this variable.

After that, we open the file again in write mode. We call the json.dump() function, pass it the data and the file as parameters, and close the file afterward.

The output shows that the data has been updated and we can confirm this by looking at the json file.
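The same read and write steps can also be written with the with statement, which closes the file automatically. The sketch below writes to a temporary file named example.json (a made-up name) so that it does not touch jsonData.json, and reads the file back to confirm the contents.

```python
import json
import os
import tempfile

data = {"firstname": "Hamza", "city": "Kyoto"}
data.update({"DOB": "22-10-2001"})

# Hypothetical file path inside a fresh temporary directory
path = os.path.join(tempfile.mkdtemp(), "example.json")

# The file is closed automatically when the with block ends
with open(path, "w", encoding="utf-8") as f:
    json.dump(data, f, indent=4)

# Read the file back to confirm what was written
with open(path, encoding="utf-8") as f:
    restored = json.load(f)
print(restored)
```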

 

Pretty Printing JSON

You can use the indent parameter to make the JSON output easier to read.

Code:

import json
data = {
  "name": "John",
  "age": 30,
  "city": "New York",
  "hasChildren": False,
  "titles": ["engineer", "programmer"]
}
pretty_json = json.dumps(data, indent=4)
print(pretty_json)

Output:

{
    "name": "John",
    "age": 30,
    "city": "New York",
    "hasChildren": false,
    "titles": [
        "engineer",
        "programmer"
    ]
}

The code uses the json.dumps function with the indent parameter set to 4.

This causes the output JSON string to be formatted with an indentation of four spaces at each level, making it more readable.

 

Using Separators Parameter

You can use the separators parameter to control the delimiters between items in the JSON output.

This can be helpful in various formatting situations.

Code:

import json
data = {
    "name": "John",
    "age": 30,
    "city": "New York"
}
custom_separators_json = json.dumps(data, separators=(',', ':'))
print(custom_separators_json)

Output:

{"name":"John","age":30,"city":"New York"}

Here, the separators parameter is set to a tuple (',', ':'), which specifies the separators to use between items and key-value pairs in the JSON string.

By using this customization, you can reduce the amount of whitespace in the JSON output, making it more compact.
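To see the size difference, this short sketch compares the default output of json.dumps() with the compact form for the same data:

```python
import json

data = {"name": "John", "age": 30, "city": "New York"}

# Default separators are ', ' and ': ', which add a space after each delimiter
default_json = json.dumps(data)
compact_json = json.dumps(data, separators=(',', ':'))
print(len(default_json), len(compact_json))
```

Both strings parse back to the same dictionary, so the compact form only changes the whitespace, not the data.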

 

Parse JSON

We can parse a JSON string in Python by simply using the json.loads() method. This method converts the JSON string into a Python dictionary that can be indexed based on the keys present in the dictionary.

Syntax:

json.loads(jsonString)

Here ‘jsonString’ is the JSON string that is passed into the method as an argument. The method will parse the JSON string and return a Python dictionary that can be further stored in a variable.

We can also perform all the dictionary methods on this variable. The following code implements this functionality.

Code:

import json
data = '{"Name":"John Doe", "ID":"123"}'
json_data = json.loads(data)
print(json_data['Name'])

Output:

Parse json object using keys

In this code, we are passing the JSON string ‘data’ as the argument to the method json.loads() that returns a dictionary which is stored in the variable ‘json_data’. The print function verifies that the method ran successfully.
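When the input is not valid JSON, json.loads() raises json.JSONDecodeError. The following is a minimal sketch of handling that case; the helper name parse_or_none is made up for illustration.

```python
import json

def parse_or_none(text):
    """Return the parsed object, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # lineno, colno and msg describe where and why parsing failed
        print(f"Invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}")
        return None

good = parse_or_none('{"Name": "John Doe", "ID": "123"}')
# JSON requires double quotes, so this string is rejected
bad = parse_or_none("{'Name': 'John Doe'}")
print(good, bad)
```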

 

Validating JSON

JSON validation is the process of checking whether a JSON object conforms to a predefined schema or structure.

You can use libraries such as jsonschema to validate JSON data. Here’s an example:

Install jsonschema library:

pip install jsonschema

Validate a JSON object:

from jsonschema import validate
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
    },
    "required": ["name", "age"]
}
data = {"name": "John", "age": 30}
validate(instance=data, schema=schema)

Output:
No exception is thrown; the data conforms to the schema.

The code snippet defines a schema that requires an object with a string name and an integer age.

The validate function from the jsonschema library is then used to check if the data object meets this schema.

Since the data object complies with the defined schema, the code runs without any errors.

 

Merging JSON Objects

In Python, merging JSON objects can be done using standard dictionary merging techniques, since a JSON object in Python is represented as a dictionary. Here’s how you can combine two JSON objects:

json_object1 = {
    "name": "John",
    "age": 30
}
json_object2 = {
    "city": "New York",
    "hasChildren": False
}
merged_json_object = {**json_object1, **json_object2}
print(merged_json_object)

Output:

{'name': 'John', 'age': 30, 'city': 'New York', 'hasChildren': False}

Here, two JSON objects, json_object1 and json_object2, are merged using dictionary unpacking: {**json_object1, **json_object2}.

The resulting merged_json_object contains all the key-value pairs from both original objects.
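When both objects share a key, the value from the object that appears later wins. A small sketch with made-up dictionaries named 'base' and 'override':

```python
# Hypothetical objects that both define the key "city"
base = {"name": "John", "city": "Boston"}
override = {"city": "New York", "hasChildren": False}

# Keys from the right-hand dictionary replace matching keys from the left
merged = {**base, **override}
print(merged)
```

The original dictionaries are left unchanged, since the merge builds a new dictionary.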

 

Object to JSON

Python objects can be converted into JSON using the same json.dumps() method that we have discussed earlier. Let’s take a look at how this will be done.

Code:

import json
class Car:
    def __init__(self, model, make, engine_capacity):
        self.model = model
        self.make = make
        self.engine_capacity = engine_capacity
car_1 = Car('2001', 'Honda', '1.8L')
json_data = json.dumps(car_1.__dict__)
print(json_data)

Output:

Convert a Python object into a JSON object

In this code, we first create a class Car and then create an object of this class.

Then we use the json.dumps() function and pass the car object as ‘car_1.__dict__’. The ‘__dict__’ attribute holds all of the object’s member variables as a dictionary, and this dictionary is what gets passed to the json.dumps() method.

As we can see from the output the object has been converted to JSON.
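Passing ‘__dict__’ directly stops working when an attribute is itself an object. One possible way around this is the default parameter of json.dumps(), which is called for any value that cannot be serialized as-is. This is a sketch; the Engine class and the helper to_serializable are made up for illustration.

```python
import json

class Engine:
    def __init__(self, capacity):
        self.capacity = capacity

class Car:
    def __init__(self, make, engine):
        self.make = make
        self.engine = engine  # nested object, not directly serializable

def to_serializable(obj):
    # Called by json.dumps() only for objects it cannot serialize itself
    return obj.__dict__

car_1 = Car('Honda', Engine('1.8L'))
json_data = json.dumps(car_1, default=to_serializable)
print(json_data)
```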

 

JSON to object

To convert a JSON string to a Python object, we will need a class whose object we have to create and use the json.loads() method as follows:

Code:

import json
class Car:
    def __init__(self, model, make, engine_capacity):
        self.model = model
        self.make = make
        self.engine_capacity = engine_capacity
json_data = '{"model": "2001", "make": "Honda", "engine_capacity": "1.8L"}'
data = json.loads(json_data)
car_1 = Car(**data)
print(car_1.engine_capacity, car_1.make, car_1.model)

Output:

Convert JSON object to Python object

Here we have loaded the data into the variable ‘data’ and then passed this dictionary to the car class as keyword argument. We can see in the output that the object has been created.
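The two steps can also be combined with the object_hook parameter of json.loads(), which is called with every JSON object that gets decoded. This is a minimal sketch that assumes the JSON is flat and that its keys match the constructor's parameter names exactly.

```python
import json

class Car:
    def __init__(self, model, make, engine_capacity):
        self.model = model
        self.make = make
        self.engine_capacity = engine_capacity

json_data = '{"model": "2001", "make": "Honda", "engine_capacity": "1.8L"}'

# object_hook receives each decoded JSON object as a dict
car_1 = json.loads(json_data, object_hook=lambda d: Car(**d))
print(car_1.engine_capacity, car_1.make, car_1.model)
```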

 

Bytes to JSON

Converting a byte string that contains JSON text into a Python object is very simple. We decode the bytes into a string and then parse it with the built-in json.loads() function.

Syntax:

json.loads(bytesString.decode('utf-8'))

The following code illustrates this functionality.

Code:

import json
byte_str = b'{"Name":"Hamza", "ID":"123"}'
dec_str = byte_str.decode('utf-8')
# json.loads() parses the text; json.dumps() here would only wrap it in another layer of quotes
data = json.loads(dec_str)
print(data)

Output:

{'Name': 'Hamza', 'ID': '123'}

Here we have first defined a byte string and then decoded it using the ‘utf-8’ character set. After that, we have simply used json.loads() to parse the decoded string into a Python dictionary.
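As a side note, since Python 3.6 json.loads() also accepts bytes directly, as long as they are encoded in UTF-8, UTF-16, or UTF-32. A short sketch:

```python
import json

byte_str = b'{"Name":"Hamza", "ID":"123"}'

# No explicit decode step: json.loads() detects the encoding of the bytes
data = json.loads(byte_str)
print(data["Name"])
```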

 

Convert HTML to JSON

To convert HTML into a JSON object, we will have to use another Python package called html-to-json. What this package does is basically take an HTML file and convert it into a JSON object.

We can install this package by using the following command in our command prompt or terminal:

Syntax:

pip install html-to-json

First, we need to import it into our program.

Syntax:

import html_to_json

After importing we can now write our code to convert the HTML file to a JSON object. Here is the sample HTML file we’ll be using:

Code:

<!doctype html>
<html lang="en-US">
    <head>
        <title>Sample Html Doc</title>
    </head>
    <body>
        <div>
            <h1>First Heading</h1>
            <p>This is a sample HTML Doc</p>
        </div>
    </body>
</html>

Now, we’ll move on to writing the code to convert this HTML into JSON.

Code:

import json
import html_to_json
file = open("sample.html", "r")
html = file.read()
file.close()
output_json = html_to_json.convert(html)
print(output_json)

Output:

Convert an HTML Doc into JSON Object

In this code, we use the html-to-json package to convert the HTML to JSON. We call the html_to_json.convert() method for this purpose and pass it the string containing the desired HTML.

 

JSON to SQL

Converting a JSON object into a SQL table takes a few more steps than calling a single method. Here we are using two new packages that we have not used before.

First is the Pandas package that is a data analysis tool. We are just going to use it to convert our JSON object into a Pandas DataFrame.

The second package is sqlalchemy. This package is a database toolkit and an object-relational mapper (ORM). Here’s how we can import these packages:

Syntax:

import pandas as pd
from sqlalchemy import create_engine

Here create_engine is a function that creates the connection to our SQLite database. The following code illustrates this functionality:

Code:

import json
import pandas as pd
from sqlalchemy import create_engine
with open("jsonsample.json") as f:
    data = json.load(f)
df = pd.DataFrame(data)
engine = create_engine("sqlite:///my_data.db")
df.to_sql("Sample_Data", con=engine)

When we run this code, a database named ‘my_data.db’ is created. After that, the data is inserted into the database under the table name ‘Sample_Data’.

We can confirm this by running the following commands in our command prompt or terminal:

Code:

$ sqlite3 my_data.db
sqlite> .schema

Convert a JSON object into a SQL table using sqlalchemy and sqlite database

Depending on the JSON object, you can see that the table has been created and the data has been inserted.
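For small jobs where installing Pandas and sqlalchemy is not an option, a similar result can be sketched with only the standard library's sqlite3 module. This is a different technique from the one above. It assumes the JSON is a flat array of objects with the same keys, and the table name, column names, and sample records are made up for illustration.

```python
import json
import sqlite3

# Hypothetical JSON array of flat objects
records = json.loads('[{"name": "John", "age": 30}, {"name": "Jane", "age": 25}]')

# An in-memory database keeps the sketch from creating files on disk
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample_data (name TEXT, age INTEGER)")

# Named placeholders (:name, :age) are filled from each dictionary
conn.executemany("INSERT INTO sample_data (name, age) VALUES (:name, :age)", records)
conn.commit()

rows = conn.execute("SELECT name, age FROM sample_data ORDER BY age").fetchall()
conn.close()
print(rows)
```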

 

JSON load() VS loads()

The difference between the two is that the load() method takes an open JSON file (a file object) as its argument, reads it, and returns the parsed data, which we can then store in a variable.

The loads() method, on the other hand, takes a JSON string that is held in a Python variable and deserializes that string into a Python object. The following code samples display this functionality.

Code: (load())

import json
jsonFile = open('jsonData.json')
data = json.load(jsonFile)
print(data)
jsonFile.close()

Output:

JSON load() Method Example

Code: (loads())

import json
jsonData = '{"Name": "Hamza", "ID":"12345"}'
data = json.loads(jsonData)
print(data)

Output:

JSON loads() Method Example
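To compare the two side by side without needing a file on disk, the sketch below wraps the same JSON text in io.StringIO, which behaves like an open text file:

```python
import io
import json

text = '{"Name": "Hamza", "ID": "12345"}'

# loads() parses a string directly
from_string = json.loads(text)
# load() reads from a file-like object; StringIO stands in for a real file
from_file = json.load(io.StringIO(text))
print(from_string == from_file)
```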

 

JSON dumps() VS loads()

The json.loads() and json.dumps() methods are opposites. The json.loads() method takes a JSON string and returns a Python object (such as a dictionary) that can be used further.

Whereas the json.dumps() method takes a Python object and returns a JSON string that contains all of the data.

The following code samples illustrate this functionality:

Code:

import json
json_data = '{"Name":"Hamza", "ID":"123"}'
data = json.loads(json_data)
print("loads method: ", data)
dumps_data = json.dumps(data)
print("dumps method: ", dumps_data)

Output:

JSON loads() vs dumps() Method
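One detail worth knowing about this round trip: JSON has no tuple type, so a tuple is written as a JSON array and comes back as a list. A minimal sketch with made-up data:

```python
import json

# Hypothetical data containing a tuple
original = {"point": (1, 2), "tags": ["a", "b"], "active": True}

json_text = json.dumps(original)
restored = json.loads(json_text)

print(json_text)
print(type(restored["point"]).__name__)
```

Dictionaries, lists, strings, numbers, booleans, and None survive the round trip unchanged, as long as all dictionary keys are strings.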

I hope you like the tutorial. Keep coming back.
