JSON Manipulation and Conversion Techniques in Python
In this tutorial, you’ll learn various JSON processing techniques, such as loading, writing, sorting, and parsing JSON.
JSON stands for JavaScript Object Notation, a lightweight text format for representing structured data. It is widely used to exchange information between systems.
In Python, we can use JSON by importing the built-in Python module called json. The json module encodes and decodes JSON data.
- 1 Why use JSON?
- 2 Read JSON file
- 3 Get JSON value
- 4 Update & Delete JSON object
- 5 Update JSON Value by Key
- 6 Rename JSON Keys
- 7 Remove Duplicates
- 8 Sort JSON
- 9 Create JSON objects
- 10 Write JSON to file
- 11 Pretty Printing JSON
- 12 Using Separators Parameter
- 13 Parse JSON
- 14 Validating JSON
- 15 Merging JSON Objects
- 16 Object to JSON
- 17 JSON to object
- 18 Bytes to JSON
- 19 Convert HTML to JSON
- 20 JSON to SQL
- 21 JSON load() VS loads()
- 22 JSON dumps() VS loads()
Why use JSON?
JSON contains data that can be read by humans and by machines. The main purpose of using JSON in Python is to store & retrieve lists, tuples, and dictionaries.
Most of the APIs use JSON format to pass information. Similarly, if you have a large set of data, you can encode the data in a JSON format and store it in a database.
The syntax to load this package is as follows:
Syntax:
import json
Read JSON file
To read data from a JSON file, we can use the load() method.
Reading JSON data in Python means converting JSON objects to Python objects. The conversion of JSON objects to Python objects is called deserialization. For instance, a JSON array is equivalent to a list in Python.
The syntax for the load() is given below:
Syntax:
data = json.load(fileObject)
- ‘fileObject’ is an open file (or any file-like object) that contains JSON text. Its contents are parsed once the statement is executed and stored in the variable ‘data’ as a Python object.
Consider the following JSON object:
Code:
{
    "date": "2021-07-17",
    "firstname": "Hamza",
    "lastname": "Sher",
    "city": "Kyoto",
    "array": [
        "Carmela",
        "Ashlee",
        "Alisha"
    ],
    "array of objects": [
        { "index": 0, "index start at 5": 5 },
        { "index": 1, "index start at 5": 6 },
        { "index": 2, "index start at 5": 7 }
    ]
}
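The following code loads this file and prints the entire JSON object:
Code:
import json

jsonFile = open('jsonData.json')
data = json.load(jsonFile)
print(data)
jsonFile.close()
Output:
{'date': '2021-07-17', 'firstname': 'Hamza', 'lastname': 'Sher', 'city': 'Kyoto', 'array': ['Carmela', 'Ashlee', 'Alisha'], 'array of objects': [{'index': 0, 'index start at 5': 5}, {'index': 1, 'index start at 5': 6}, {'index': 2, 'index start at 5': 7}]}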
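As a hedged variation on the code above, the file can also be opened with a with statement, which closes it automatically. The sample data and the temporary file below are assumptions made only to keep the sketch self-contained.

```python
import json
import os
import tempfile

# Hypothetical sample data, written to a temporary file so the sketch is self-contained.
sample = {"firstname": "Hamza", "city": "Kyoto"}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "jsonData.json")
    with open(path, "w", encoding="utf-8") as json_file:
        json.dump(sample, json_file)

    # The with statement closes the file automatically, even if json.load() raises.
    with open(path, encoding="utf-8") as json_file:
        data = json.load(json_file)

print(data["city"])  # Kyoto
```

With this pattern there is no need to call close() explicitly.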
If we have a string that is storing the JSON object, we can use the loads() method to read that string.
Syntax:
data = json.loads(jsonString)
The following code parses a JSON string and prints the resulting dictionary:
Code:
import json

jsonData = '{"Name": "Hamza", "ID":"12345"}'
data = json.loads(jsonData)
print(data)
Output:
{'Name': 'Hamza', 'ID': '12345'}
Get JSON value
JSON objects are constructed in key-value pairs which makes getting a particular value from the object very simple. We can use dictionary indexing to access the value associated with the key.
Syntax:
data['firstname']
The following code demonstrates how we can use it to get our desired results.
Code:
import json

jsonFile = open('jsonData.json')
data = json.load(jsonFile)
print(data['firstname'])
jsonFile.close()
Output:
Hamza
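Indexing also works for nested values. The sketch below uses a shortened, assumed copy of the sample data to show chained indexing, and uses dict.get() to supply a default for a key that may be missing (the key ‘middlename’ is hypothetical).

```python
import json

# Shortened, assumed copy of the tutorial's sample data.
json_text = (
    '{"firstname": "Hamza", "array": ["Carmela", "Ashlee"], '
    '"array of objects": [{"index": 0}, {"index": 1}]}'
)
data = json.loads(json_text)

second_item = data["array"][1]                       # index into a nested list
second_index = data["array of objects"][1]["index"]  # list element, then key
middle_name = data.get("middlename", "n/a")          # default instead of KeyError

print(second_item, second_index, middle_name)  # Ashlee 1 n/a
```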
Update & Delete JSON object
Once JSON has been loaded, it is an ordinary Python dictionary, so updating it is as simple as calling the dictionary’s built-in update() method. The json module itself does not provide an update function.
The update() method adds new key-value pairs to the dictionary and overwrites the values of keys that already exist. We can pass a single key-value pair or a whole dictionary to be merged into the existing data.
Syntax:
jsonObject.update(KeyValuePair)
The following code implements the update() method.
Code:
import json

jsonData = '{"ID":"123", "Name": "Hamza"}'
data = json.loads(jsonData)
newData = {"DOB": "22-10-2001"}
data.update(newData)
print(data)
Output:
{'ID': '123', 'Name': 'Hamza', 'DOB': '22-10-2001'}
The contents of the dictionary ‘newData’ have been added to ‘data’, the dictionary parsed from ‘jsonData’. This is how the update() method performs its functionality.
Moving onto the delete functionality. There is no built-in function in the json package to delete a key-value pair. Therefore, we will have to write a little bit more code to perform this function.
Here is how we can implement deletion on a JSON object. Remember we are using the same JSON file that we have been using and have mentioned at the start of this tutorial.
Code:
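import json

file = open('jsonData.json', 'r')
data = json.load(file)
file.close()
if 'firstname' in data:
    del data['firstname']
print(data)
Output:
{'date': '2021-07-17', 'lastname': 'Sher', 'city': 'Kyoto', 'array': ['Carmela', 'Ashlee', 'Alisha'], 'array of objects': [{'index': 0, 'index start at 5': 5}, {'index': 1, 'index start at 5': 6}, {'index': 2, 'index start at 5': 7}]}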
Let us take a look at what’s really happening here. When we put a check to see if ‘firstname’ exists in the dictionary, Python checks the dictionary and if the key exists, we can use the del keyword to delete that key-value pair.
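A possible alternative to the check-then-delete pattern is dict.pop() with a default value, which removes the key if it exists and does nothing otherwise. This is a minimal sketch on assumed sample data; the key ‘nickname’ is hypothetical.

```python
# Assumed sample data standing in for the loaded JSON object.
data = {"firstname": "Hamza", "lastname": "Sher"}

removed = data.pop("firstname", None)  # key exists: removed and returned
missing = data.pop("nickname", None)   # key absent: returns the default, no error

print(removed, missing, data)  # Hamza None {'lastname': 'Sher'}
```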
Update JSON Value by Key
Here’s how you can update a value by key:
json_object = {
    "name": "John",
    "age": 30,
    "city": "New York"
}
json_object["age"] = 35
print(json_object)
Output:
{'name': 'John', 'age': 35, 'city': 'New York'}
The value associated with the key "age" is updated from 30 to 35.
Rename JSON Keys
Here’s a common approach to renaming a key in Python:
json_object = {
    "firstname": "John",
    "age": 30
}
json_object["name"] = json_object.pop("firstname")
print(json_object)
Output:
{'age': 30, 'name': 'John'}
The code uses the pop method to remove the key "firstname" and simultaneously retrieve its value. Then, it assigns that value to a new key "name".
The result is that the key "firstname" is effectively renamed to "name". Note that the renamed key moves to the end of the dictionary.
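When several keys need renaming, or when the original key order should be kept, one option is to rebuild the dictionary with a comprehension. The helper name rename_keys and its mapping argument are hypothetical and serve only as an illustration.

```python
def rename_keys(obj, mapping):
    """Return a new dict with keys renamed according to mapping (old -> new)."""
    # Keys not listed in mapping are kept unchanged; order is preserved.
    return {mapping.get(key, key): value for key, value in obj.items()}


person = {"firstname": "John", "age": 30}
renamed = rename_keys(person, {"firstname": "name"})

print(renamed)  # {'name': 'John', 'age': 30}
```

Unlike the pop approach, this leaves the original dictionary untouched.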
Remove Duplicates
You can use Python’s built-in set type to remove duplicates from a JSON array.
Code:
import json

json_array = [
    {"name": "John", "age": 30},
    {"name": "Jane", "age": 25},
    {"name": "John", "age": 30}
]
# Convert JSON objects to strings to make them hashable
json_strings = [json.dumps(item, sort_keys=True) for item in json_array]
# Use set to remove duplicates
unique_json_strings = set(json_strings)
# Convert back to JSON objects
unique_json_array = [json.loads(item) for item in unique_json_strings]
print(unique_json_array)
Output:
[{'age': 25, 'name': 'Jane'}, {'age': 30, 'name': 'John'}]
First, we convert each JSON object in the array to a string using json.dumps, making sure to sort the keys to ensure consistent ordering.
Then, we use a set to remove duplicates, as sets cannot contain duplicate elements.
Finally, we convert the unique JSON strings back to JSON objects using json.loads. Because sets are unordered, the order of the items in the result may differ between runs.
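If the original order of the array matters, a set can still be used, but only to remember what has already been seen. The following is a sketch of that variation, not the approach shown above; the sample data is assumed.

```python
import json

json_array = [
    {"name": "John", "age": 30},
    {"name": "Jane", "age": 25},
    {"age": 30, "name": "John"},  # same content as the first item, different key order
]

seen = set()
unique = []
for item in json_array:
    # sort_keys=True makes objects with the same content produce the same string.
    key = json.dumps(item, sort_keys=True)
    if key not in seen:
        seen.add(key)
        unique.append(item)

print(unique)  # [{'name': 'John', 'age': 30}, {'name': 'Jane', 'age': 25}]
```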
Sort JSON
We can sort a JSON object alphabetically based on the keys. To do this, we use the json.dumps() method along with a few arguments to the method. The syntax to use this method is as follows:
Syntax:
json.dumps(data, sort_keys=True)
Here we pass two arguments to the function json.dumps(). The first one ‘data’ contains the JSON object that we stored in a Python variable.
The second is the sort_keys argument, that when set to True, sorts the data alphabetically and returns the JSON object as a string. The following code uses this functionality:
Code:
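import json

file = open('jsonData.json', 'r')
data = json.load(file)
file.close()
print(json.dumps(data, sort_keys=True))
Output:
{"array": ["Carmela", "Ashlee", "Alisha"], "array of objects": [{"index": 0, "index start at 5": 5}, {"index": 1, "index start at 5": 6}, {"index": 2, "index start at 5": 7}], "city": "Kyoto", "date": "2021-07-17", "firstname": "Hamza", "lastname": "Sher"}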
Looking at the code, it’s fairly easy to understand what is going on. First, we are loading the data and storing it into the variable ‘data’, and closing the file afterward.
Then in a single statement, we print the sorted data with the help of the function json.dumps() and the sort_keys=True argument.
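The sort_keys argument orders keys only. To order the items of a JSON array by one of their values, Python’s built-in sorted() can be used before dumping, as in this sketch with assumed sample data.

```python
import json

# Assumed sample data: a JSON array of objects.
people = [
    {"name": "John", "age": 30},
    {"name": "Jane", "age": 25},
    {"name": "Ali", "age": 41},
]

# Sort the list by the value stored under "age" (ascending).
by_age = sorted(people, key=lambda person: person["age"])

print(json.dumps(by_age))
```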
Create JSON objects
To create a JSON object, we need a Python dictionary that contains our data. We will use the same methods as before, i.e., json.dumps() and json.loads(). The following code implements this functionality:
Code:
import json

data = {"Name":"John Doe", "ID":"123"}
json_dump = json.dumps(data)
json_data = json.loads(json_dump)
print(json_data)
Output:
{'Name': 'John Doe', 'ID': '123'}
Here we define some data as a Python dictionary. Then we use the json.dumps() method and pass the Python dictionary as an argument.
This converts our Python dictionary into a JSON string that can be passed to the json.loads() method. The json.loads() method then parses this string back into a Python dictionary, which we can see when it is printed.
Write JSON to file
To write a JSON object into a JSON file, we can use the json.dump() method. This method takes the data that we will write to the file and also the file that we will write the data into. The following code explains how we can do just that!
Code:
import json

file = open('jsonData.json', 'r')
data = json.load(file)
file.close()
newData = {"DOB": "22-10-2001"}
data.update(newData)
file = open('jsonData.json', 'w')
json.dump(data, file)
file.close()
print(data)
Output:
{'date': '2021-07-17', 'firstname': 'Hamza', 'lastname': 'Sher', 'city': 'Kyoto', 'array': ['Carmela', 'Ashlee', 'Alisha'], 'array of objects': [{'index': 0, 'index start at 5': 5}, {'index': 1, 'index start at 5': 6}, {'index': 2, 'index start at 5': 7}], 'DOB': '22-10-2001'}
First, we open the file in read mode and store the contents of the file into the variable ‘data’. Then we update the ‘data’ and add the new key-value pair into this variable.
After that, we open the file again in write mode. We call the json.dump() function, pass it the data and the file as parameters, and close the file afterward.
The output shows that the data has been updated and we can confirm this by looking at the json file.
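json.dump() accepts the same formatting arguments as json.dumps(), so the file can be written in a readable layout. The sketch below writes to a temporary directory (an assumption made so the example does not overwrite any real file) and reads the text back.

```python
import json
import os
import tempfile

# Assumed sample data.
data = {"firstname": "Hamza", "DOB": "22-10-2001"}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "jsonData.json")

    # indent=4 writes one key per line instead of a single long line.
    with open(path, "w", encoding="utf-8") as json_file:
        json.dump(data, json_file, indent=4)

    # Read the raw text back to see what was written.
    with open(path, encoding="utf-8") as json_file:
        written_text = json_file.read()

print(written_text)
```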
Pretty Printing JSON
You can use the indent parameter to produce more readable, pretty-printed JSON output.
Code:
import json

data = {
    "name": "John",
    "age": 30,
    "city": "New York",
    "hasChildren": False,
    "titles": ["engineer", "programmer"]
}
pretty_json = json.dumps(data, indent=4)
print(pretty_json)
Output:
{
    "name": "John",
    "age": 30,
    "city": "New York",
    "hasChildren": false,
    "titles": [
        "engineer",
        "programmer"
    ]
}
The code uses the json.dumps function with the indent parameter set to 4.
This causes the output JSON string to be formatted with an indentation of four spaces at each level, making it more readable.
Using Separators Parameter
You can use the separators parameter to control the delimiters between items in the JSON output.
This can be helpful in various formatting situations.
Code:
import json

data = {
    "name": "John",
    "age": 30,
    "city": "New York"
}
custom_separators_json = json.dumps(data, separators=(',', ':'))
print(custom_separators_json)
Output:
{"name":"John","age":30,"city":"New York"}
Here, the separators parameter is set to a tuple (',', ':'), which specifies the separators to use between items and between keys and values in the JSON string.
By using this customization, you can reduce the amount of whitespace in the JSON output, making it more compact.
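To make the saving concrete, the following sketch compares the default output of json.dumps() with the compact form for the same data. The variable names are illustrative only.

```python
import json

data = {"name": "John", "age": 30, "city": "New York"}

default_json = json.dumps(data)                         # uses ', ' and ': '
compact_json = json.dumps(data, separators=(",", ":"))  # no spaces after delimiters

# Each of the 2 item separators and 3 key separators loses one space.
saved = len(default_json) - len(compact_json)
print(saved)  # 5
```

Both strings parse back to the same dictionary; only the whitespace differs.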
Parse JSON
We can parse a JSON string in Python by simply using the json.loads() method. This method converts the JSON string into a Python dictionary that can be indexed based on the keys present in the dictionary.
Syntax:
json.loads(jsonString)
Here ‘jsonString’ is the JSON string that is passed into the method as an argument. The method will parse the JSON string and return a Python dictionary that can be further stored in a variable.
We can also perform all the dictionary methods on this variable. The following code implements this functionality.
Code:
import json

data = '{"Name":"John Doe", "ID":"123"}'
json_data = json.loads(data)
print(json_data['Name'])
Output:
John Doe
In this code, we are passing the JSON string ‘data’ as the argument to the method json.loads() that returns a dictionary which is stored in the variable ‘json_data’. The print function verifies that the method ran successfully.
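If the string is not valid JSON, json.loads() raises json.JSONDecodeError. A minimal sketch of handling that case follows; the helper name parse_or_none is hypothetical.

```python
import json


def parse_or_none(text):
    """Return the parsed object, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # lineno, colno and msg describe where and why parsing failed.
        print(f"Invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}")
        return None


good = parse_or_none('{"Name": "John Doe"}')
bad = parse_or_none("{'Name': 'John Doe'}")  # single quotes are not valid JSON
```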
Validating JSON
JSON validation is the process of checking whether a JSON object conforms to a predefined schema or structure.
You can use libraries such as jsonschema to validate JSON data. Here’s an example:
Install jsonschema library:
pip install jsonschema
Validate a JSON object:
from jsonschema import validate

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
    },
    "required": ["name", "age"]
}
data = {"name": "John", "age": 30}
validate(instance=data, schema=schema)
Output:
No exception is thrown; the data conforms to the schema.
The code snippet defines a schema that requires an object with a string name and an integer age.
The validate function from the jsonschema library is then used to check if the data object meets this schema.
Since the data object complies with the defined schema, the code runs without any errors. If it did not comply, validate would raise a jsonschema.ValidationError.
Merging JSON Objects
In Python, merging JSON objects can be done using standard dictionary merging techniques, since a JSON object in Python is represented as a dictionary. Here’s how you can combine two JSON objects:
json_object1 = {
    "name": "John",
    "age": 30
}
json_object2 = {
    "city": "New York",
    "hasChildren": False
}
merged_json_object = {**json_object1, **json_object2}
print(merged_json_object)
Output:
{'name': 'John', 'age': 30, 'city': 'New York', 'hasChildren': False}
Here, two JSON objects, json_object1 and json_object2, are merged using dictionary unpacking {**json_object1, **json_object2}.
The resulting merged_json_object contains all the key-value pairs from both original objects. If both objects share a key, the value from the second object wins.
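The merge above is shallow: when both objects contain the same key, the second value replaces the first entirely, even if both values are nested objects. The recursive helper below is a hypothetical sketch (deep_merge is not a standard-library function) showing one way nested objects could be combined instead.

```python
def deep_merge(base, extra):
    """Return a new dict where nested dicts are merged instead of replaced."""
    merged = dict(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # merge nested objects
        else:
            merged[key] = value  # otherwise the value from extra wins
    return merged


# Assumed sample data with an overlapping nested key.
a = {"name": "John", "address": {"city": "New York", "zip": "10001"}}
b = {"age": 30, "address": {"city": "Boston"}}

shallow = {**a, **b}
deep = deep_merge(a, b)

print(shallow["address"])  # {'city': 'Boston'}
print(deep["address"])     # {'city': 'Boston', 'zip': '10001'}
```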
Object to JSON
Python objects can be converted into JSON using the same json.dumps() method that we have discussed earlier. Let’s take a look at how this will be done.
Code:
import json

class Car:
    def __init__(self, model, make, engine_capacity):
        self.model = model
        self.make = make
        self.engine_capacity = engine_capacity

car_1 = Car('2001', 'Honda', '1.8L')
json_data = json.dumps(car_1.__dict__)
print(json_data)
Output:
{"model": "2001", "make": "Honda", "engine_capacity": "1.8L"}
In this code, we first create a class Car and then create an object of this class.
Then we use the json.dumps() function and pass the car object as ‘car_1.__dict__’. The ‘__dict__’ attribute holds all the instance variables as a dictionary, which is what gets passed to the json.dumps() method.
As we can see from the output the object has been converted to JSON.
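Passing ‘__dict__’ works for flat objects. When an object contains other objects, one possible approach is the default parameter of json.dumps(), which is called for every value the encoder cannot serialize on its own. The Engine class and the to_serializable function below are assumptions added for illustration.

```python
import json


class Engine:
    def __init__(self, capacity):
        self.capacity = capacity


class Car:
    def __init__(self, make, engine):
        self.make = make
        self.engine = engine  # a nested object, not a plain value


def to_serializable(obj):
    # json.dumps calls this only for objects it cannot serialize itself.
    if hasattr(obj, "__dict__"):
        return obj.__dict__
    raise TypeError(f"{type(obj).__name__} is not JSON serializable")


car = Car("Honda", Engine("1.8L"))
json_data = json.dumps(car, default=to_serializable)

print(json_data)  # {"make": "Honda", "engine": {"capacity": "1.8L"}}
```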
JSON to object
To convert a JSON string to a Python object, we need a class to instantiate. We parse the string with the json.loads() method and pass the resulting dictionary to the class constructor as follows:
Code:
import json

class Car:
    def __init__(self, model, make, engine_capacity):
        self.model = model
        self.make = make
        self.engine_capacity = engine_capacity

json_data = '{"model": "2001", "make": "Honda", "engine_capacity": "1.8L"}'
data = json.loads(json_data)
car_1 = Car(**data)
print(car_1.engine_capacity, car_1.make, car_1.model)
Output:
1.8L Honda 2001
Here we have loaded the data into the variable ‘data’ and then passed this dictionary to the Car class as keyword arguments. We can see in the output that the object has been created.
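json.loads() can also build the object during parsing through its object_hook parameter, which is called with every decoded JSON object. The following sketch assumes a hypothetical hook named as_car and a nested sample document.

```python
import json


class Car:
    def __init__(self, model, make, engine_capacity):
        self.model = model
        self.make = make
        self.engine_capacity = engine_capacity


def as_car(obj):
    # Called for every JSON object; convert only those that look like a car.
    if {"model", "make", "engine_capacity"}.issubset(obj):
        return Car(**obj)
    return obj  # leave all other objects as plain dictionaries


json_data = (
    '{"owner": "Hamza", '
    '"car": {"model": "2001", "make": "Honda", "engine_capacity": "1.8L"}}'
)
garage = json.loads(json_data, object_hook=as_car)

print(garage["owner"], garage["car"].make)  # Hamza Honda
```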
Bytes to JSON
Converting a byte string into JSON data is very simple. We decode the bytes into a regular string and then parse that string with the built-in json.loads() function.
Syntax:
json.loads(bytesString.decode('utf-8'))
The following code illustrates this functionality.
Code:
import json

byte_str = b'{"Name":"Hamza", "ID":"123"}'
dec_str = byte_str.decode('utf-8')
data = json.loads(dec_str)
print(data)
Output:
{'Name': 'Hamza', 'ID': '123'}
Here we have first defined a byte string and then decoded it using the ‘utf-8’ character set. After that, we have used json.loads() to parse the decoded string into a Python dictionary. Calling json.dumps() on the decoded string would only wrap it in quotes as a JSON string literal rather than parse it.
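If the goal is to turn the bytes into a Python dictionary, note that json.loads() in Python 3.6 and later also accepts bytes directly, so the explicit decode step can be skipped. A minimal sketch:

```python
import json

byte_str = b'{"Name":"Hamza", "ID":"123"}'

# json.loads() accepts bytes (UTF-8, UTF-16 or UTF-32 encoded) as well as str.
data = json.loads(byte_str)

print(data["Name"])  # Hamza
```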
Convert HTML to JSON
To convert HTML into a JSON object, we will have to use another Python package called html-to-json. What this package does is basically take an HTML file and convert it into a JSON object.
We can install this package by using the following command in our command prompt or terminal:
Syntax:
pip install html-to-json
First, we need to import it into our program.
Syntax:
import html_to_json
After importing we can now write our code to convert the HTML file to a JSON object. Here is the sample HTML file we’ll be using:
Code:
<!doctype html>
<html lang="en-US">
<head>
    <title>Sample Html Doc</title>
</head>
<body>
    <div>
        <h1>First Heading</h1>
        <p>This is a sample HTML Doc</p>
    </div>
</body>
</html>
Now, we’ll move on to writing the code to convert this HTML into JSON.
Code:
import json
import html_to_json

file = open("sample.html", "r")
html = file.read()
file.close()
output_json = html_to_json.convert(html)
print(output_json)
Output:
In this code, we used the html-to-json package to convert the HTML to JSON. We called the html_to_json.convert() method and passed it the string containing the desired HTML.
JSON to SQL
Converting a JSON object into a SQL table takes a few more steps than calling a single method. Here we are using two new packages that we have not used before.
The first is the Pandas package, which is a data analysis tool. We are just going to use it to convert our JSON object into a Pandas DataFrame.
The second package is sqlalchemy. This package is a database toolkit and an object-relational mapper (ORM). Here’s how we can import these packages:
Syntax:
import pandas as pd
from sqlalchemy import create_engine
Here create_engine is a function that creates the engine object used to connect to the database, in our case SQLite. The following code illustrates this functionality:
Code:
import json
import pandas as pd
from sqlalchemy import create_engine

with open("jsonsample.json") as f:
    data = json.load(f)
df = pd.DataFrame(data)
engine = create_engine("sqlite:///my_data.db")
df.to_sql("Sample_Data", con=engine)
When we run this code, a database named ‘my_data.db’ is created. After that, the data is inserted into the database under the table name ‘Sample_Data’.
We can confirm this by running the following commands in our command prompt or terminal:
Code:
$ sqlite3 my_data.db
sqlite> .schema
Depending on the JSON object, you can see that the table has been created and the data has been inserted.
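For small data sets, the same idea can be sketched with only the standard library’s sqlite3 module instead of Pandas and sqlalchemy. This is an alternative illustration, not the approach used above; the JSON text, the column names, and the in-memory database are assumptions.

```python
import json
import sqlite3

# Assumed sample data: a JSON array of flat objects.
json_text = '[{"name": "John", "age": 30}, {"name": "Jane", "age": 25}]'
rows = json.loads(json_text)

# ":memory:" keeps the database in RAM so no file is created.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sample_Data (name TEXT, age INTEGER)")

# Named placeholders (:name, :age) are filled from each dictionary.
conn.executemany("INSERT INTO Sample_Data (name, age) VALUES (:name, :age)", rows)
conn.commit()

result = conn.execute("SELECT name, age FROM Sample_Data ORDER BY age").fetchall()
print(result)  # [('Jane', 25), ('John', 30)]
conn.close()
```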
JSON load() VS loads()
The difference between the two is that the load() method takes an open JSON file (a file object) as its argument and parses its contents, which we can then store in a variable.
The loads() method, on the other hand, takes a JSON string held in a Python variable and deserializes that string into a Python object. The following code samples display this functionality.
Code: (load())
import json

jsonFile = open('jsonData.json')
data = json.load(jsonFile)
print(data)
jsonFile.close()
Output:
Code: (loads())
import json

jsonData = '{"Name": "Hamza", "ID":"12345"}'
data = json.loads(jsonData)
print(data)
Output:
{'Name': 'Hamza', 'ID': '12345'}
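One way to see that the two functions differ only in where the text comes from is to wrap a string in io.StringIO, which behaves like an open text file. The sketch below parses the same assumed JSON text both ways.

```python
import io
import json

json_text = '{"Name": "Hamza", "ID": "12345"}'

from_string = json.loads(json_text)                  # parses a str directly
from_file_like = json.load(io.StringIO(json_text))   # reads from a file-like object

print(from_string == from_file_like)  # True
```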
JSON dumps() VS loads()
The json.loads() and json.dumps() methods are opposites. The json.loads() method takes a JSON string and returns a Python object (such as a dictionary) that can be used further.
The json.dumps() method, in contrast, takes a Python object and returns a JSON string that contains all of the data.
The following code samples illustrate this functionality:
Code:
import json

json_data = '{"Name":"Hamza", "ID":"123"}'
data = json.loads(json_data)
print("loads method: ", data)
dumps_data = json.dumps(data)
print("dumps method: ", dumps_data)
Output:
loads method:  {'Name': 'Hamza', 'ID': '123'}
dumps method:  {"Name": "Hamza", "ID": "123"}
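Although the two methods are opposites, a round trip does not always return exactly the original Python object, because JSON has fewer types than Python. The sketch below uses assumed data to show two common cases: tuples come back as lists, and non-string keys come back as strings.

```python
import json

# Assumed sample data with a tuple value and an integer key.
original = {"point": (1, 2), 1: "one"}

round_tripped = json.loads(json.dumps(original))

print(round_tripped)  # {'point': [1, 2], '1': 'one'}
```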
I hope you like the tutorial. Keep coming back.
Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.