5 Methods to Split a JSON Array in Python
In this tutorial, you’ll learn various methods to split JSON arrays in Python.
You’ll learn about list slicing, condition-based splitting, using libraries like NumPy and Pandas, and more.
Using List Slicing
First, assume you have a JSON array like this:
import json

# JSON requires double-quoted strings; single quotes make json.loads raise an error.
json_data = '''
[
    {"id": 1, "name": "Customer A", "plan": "Premium"},
    {"id": 2, "name": "Customer B", "plan": "Basic"},
    {"id": 3, "name": "Customer C", "plan": "Standard"},
    {"id": 4, "name": "Customer D", "plan": "Premium"},
    {"id": 5, "name": "Customer E", "plan": "Basic"},
    {"id": 6, "name": "Customer F", "plan": "Standard"},
    {"id": 7, "name": "Customer G", "plan": "Premium"},
    {"id": 8, "name": "Customer H", "plan": "Basic"},
    {"id": 9, "name": "Customer I", "plan": "Standard"},
    {"id": 10, "name": "Customer J", "plan": "Premium"}
]
'''

# Parse the JSON text into a Python list of dictionaries.
customers = json.loads(json_data)
Now, let’s say you want to split this array into two parts. For simplicity, let’s split it in the middle.
# Splitting the array
mid_index = len(customers) // 2
first_half = customers[:mid_index]
second_half = customers[mid_index:]
print("First half:", first_half)
print("Second half:", second_half)
Output:
First half: [{'id': 1, 'name': 'Customer A', 'plan': 'Premium'}, {'id': 2, 'name': 'Customer B', 'plan': 'Basic'}, {'id': 3, 'name': 'Customer C', 'plan': 'Standard'}, {'id': 4, 'name': 'Customer D', 'plan': 'Premium'}, {'id': 5, 'name': 'Customer E', 'plan': 'Basic'}]
Second half: [{'id': 6, 'name': 'Customer F', 'plan': 'Standard'}, {'id': 7, 'name': 'Customer G', 'plan': 'Premium'}, {'id': 8, 'name': 'Customer H', 'plan': 'Basic'}, {'id': 9, 'name': 'Customer I', 'plan': 'Standard'}, {'id': 10, 'name': 'Customer J', 'plan': 'Premium'}]
In this output, the original array has been divided into two halves.
The mid_index determines the splitting point. Because the array has ten items, first_half and second_half each get five; with an odd number of items, second_half would get the extra one.
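If the two halves need to leave Python as JSON again (for example, to be written to separate files), each slice can be serialized with json.dumps. The following is a minimal sketch, assuming a shortened four-record version of the customer data; the names first_json and second_json are illustrative and not part of the original example.

```python
import json

# Shortened sample data (an assumption for this sketch), written as valid JSON.
json_data = (
    '[{"id": 1, "plan": "Premium"}, {"id": 2, "plan": "Basic"}, '
    '{"id": 3, "plan": "Standard"}, {"id": 4, "plan": "Premium"}]'
)
customers = json.loads(json_data)

mid_index = len(customers) // 2

# Serialize each slice back into its own JSON array string.
first_json = json.dumps(customers[:mid_index])
second_json = json.dumps(customers[mid_index:])

print(first_json)
print(second_json)
```

Parsing both strings and concatenating the results gives back the original list, which is a quick way to check that no record was lost in the split.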
Using List Comprehensions
You can split a JSON array based on a condition using list comprehensions.
Suppose you want to separate customers into two groups based on their plan: ‘Premium’ and others.
# Customers whose plan is 'Premium'
premium_customers = [customer for customer in customers if customer['plan'] == 'Premium']
# Everyone else
other_customers = [customer for customer in customers if customer['plan'] != 'Premium']
print("Premium Customers:", premium_customers)
print("Other Customers:", other_customers)
Output:
Premium Customers: [{'id': 1, 'name': 'Customer A', 'plan': 'Premium'}, {'id': 4, 'name': 'Customer D', 'plan': 'Premium'}, {'id': 7, 'name': 'Customer G', 'plan': 'Premium'}, {'id': 10, 'name': 'Customer J', 'plan': 'Premium'}]
Other Customers: [{'id': 2, 'name': 'Customer B', 'plan': 'Basic'}, {'id': 3, 'name': 'Customer C', 'plan': 'Standard'}, {'id': 5, 'name': 'Customer E', 'plan': 'Basic'}, {'id': 6, 'name': 'Customer F', 'plan': 'Standard'}, {'id': 8, 'name': 'Customer H', 'plan': 'Basic'}, {'id': 9, 'name': 'Customer I', 'plan': 'Standard'}]
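Two new lists, premium_customers and other_customers, are created. Each comprehension scans the full customers list once and keeps only the records that match its condition.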
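Two comprehensions mean two passes over the data. For large arrays, the same split can be done in one pass with a plain loop. This is a sketch of that alternative, not a replacement for the comprehension approach; it assumes a shortened sample list, and the variable name target is illustrative.

```python
# Shortened sample data (an assumption for this sketch).
customers = [
    {"id": 1, "plan": "Premium"},
    {"id": 2, "plan": "Basic"},
    {"id": 3, "plan": "Standard"},
    {"id": 4, "plan": "Premium"},
]

premium_customers = []
other_customers = []

for customer in customers:
    # Pick the destination list once per record, so the data is scanned a single time.
    target = premium_customers if customer["plan"] == "Premium" else other_customers
    target.append(customer)

print([c["id"] for c in premium_customers])
print([c["id"] for c in other_customers])
```

For small lists the difference is negligible, and the comprehension version is arguably easier to read.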
Using numpy.array_split
The NumPy array_split function is useful when you need to divide data into nearly equal parts, even when it can’t be divided evenly.
First, ensure you have numpy installed:
pip install numpy
Now, let’s apply numpy.array_split to our customer data. Assume the same JSON data as before, converted into a Python list.
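import numpy as np

# Split the list into 3 nearly equal parts
split_arrays = np.array_split(customers, 3)
for i, array in enumerate(split_arrays):
    # Each part is a NumPy array; tolist() converts it back to a plain list
    print(f"Part {i+1}:", array.tolist())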
Output:
Part 1: [{'id': 1, 'name': 'Customer A', 'plan': 'Premium'}, {'id': 2, 'name': 'Customer B', 'plan': 'Basic'}, {'id': 3, 'name': 'Customer C', 'plan': 'Standard'}, {'id': 4, 'name': 'Customer D', 'plan': 'Premium'}]
Part 2: [{'id': 5, 'name': 'Customer E', 'plan': 'Basic'}, {'id': 6, 'name': 'Customer F', 'plan': 'Standard'}, {'id': 7, 'name': 'Customer G', 'plan': 'Premium'}]
Part 3: [{'id': 8, 'name': 'Customer H', 'plan': 'Basic'}, {'id': 9, 'name': 'Customer I', 'plan': 'Standard'}, {'id': 10, 'name': 'Customer J', 'plan': 'Premium'}]
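In this example, numpy.array_split divides the list of customers into three parts. Because ten items cannot be divided evenly by three, the first part gets four customers and the other two get three each. Unlike numpy.split, which raises an error when the array cannot be divided into equal parts, numpy.array_split handles uneven divisions.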
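The part sizes follow a simple rule: for a list of length l split into n parts, the first l % n parts have l // n + 1 items and the rest have l // n. The sketch below reproduces that rule in pure Python for cases where installing NumPy is not worth it. The function name split_nearly_equal is hypothetical, and the data is a plain list of integers standing in for the customer records.

```python
def split_nearly_equal(items, parts):
    """Split items into `parts` consecutive lists whose lengths differ by at most one."""
    base, extra = divmod(len(items), parts)
    result = []
    start = 0
    for i in range(parts):
        # The first `extra` parts each take one additional item.
        size = base + 1 if i < extra else base
        result.append(items[start:start + size])
        start += size
    return result


# Integers 1..10 stand in for the ten customer records (an assumption for this sketch).
chunks = split_nearly_equal(list(range(1, 11)), 3)
print([len(chunk) for chunk in chunks])
print(chunks)
```

With ten items and three parts this gives sizes of 4, 3, and 3, matching the NumPy output above.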
Using Iterative Splitting
Iterative splitting is useful when the division logic needs to be determined dynamically, for example while looping over the data.
Suppose you want to split the customer data into groups of a fixed size. Because the plans in the sample data rotate in order, each group of three ends up with a mix of plan types. Here’s how you can do it:
def iterative_split(data, group_size):
    groups = []
    temp_group = []
    for item in data:
        temp_group.append(item)
        # Once the temporary group is full, store it and start a new one
        if len(temp_group) == group_size:
            groups.append(temp_group)
            temp_group = []
    # Keep any leftover items as a final, smaller group
    if temp_group:
        groups.append(temp_group)
    return groups

grouped_customers = iterative_split(customers, 3)
for i, group in enumerate(grouped_customers):
    print(f"Group {i+1}:", group)
Output:
Group 1: [{'id': 1, 'name': 'Customer A', 'plan': 'Premium'}, {'id': 2, 'name': 'Customer B', 'plan': 'Basic'}, {'id': 3, 'name': 'Customer C', 'plan': 'Standard'}]
Group 2: [{'id': 4, 'name': 'Customer D', 'plan': 'Premium'}, {'id': 5, 'name': 'Customer E', 'plan': 'Basic'}, {'id': 6, 'name': 'Customer F', 'plan': 'Standard'}]
Group 3: [{'id': 7, 'name': 'Customer G', 'plan': 'Premium'}, {'id': 8, 'name': 'Customer H', 'plan': 'Basic'}, {'id': 9, 'name': 'Customer I', 'plan': 'Standard'}]
Group 4: [{'id': 10, 'name': 'Customer J', 'plan': 'Premium'}]
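In this example, iterative_split is a function that takes a list of data and a group size. It appends items to a temporary group, moves that group into the result whenever it reaches group_size, and adds any leftover items as a final, smaller group.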
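When the data comes from a stream or is too large to hold every group in memory at once, the same fixed-size grouping can be written as a generator. This is a sketch using itertools.islice from the standard library; the function name iter_chunks is hypothetical, and integers stand in for the customer records.

```python
from itertools import islice


def iter_chunks(iterable, group_size):
    """Yield lists of up to group_size items, consuming the input lazily."""
    iterator = iter(iterable)
    while True:
        # islice pulls at most group_size items from the shared iterator.
        chunk = list(islice(iterator, group_size))
        if not chunk:
            return
        yield chunk


# Integers 1..10 stand in for the ten customer records (an assumption for this sketch).
groups = list(iter_chunks(range(1, 11), 3))
print(groups)
```

Each group is produced only when the caller asks for it, so a loop can process and discard one group before the next is built.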
Using Pandas GroupBy
The groupby method of a Pandas DataFrame segments the data into groups based on some criterion, so that a function can be applied to each group independently.
First, make sure Pandas is installed:
pip install pandas
Now, let’s use groupby to split our customer data based on their subscription plan.
import pandas as pd

# Converting list to Pandas DataFrame
customers_df = pd.DataFrame(customers)
grouped_customers = customers_df.groupby('plan')
# Iterating over the groups yields (key, sub-DataFrame) pairs
for plan, group in grouped_customers:
    print(f"Plan: {plan}")
    print(group)
Output:
Plan: Basic
   id        name   plan
1   2  Customer B  Basic
4   5  Customer E  Basic
7   8  Customer H  Basic
Plan: Premium
   id        name     plan
0   1  Customer A  Premium
3   4  Customer D  Premium
6   7  Customer G  Premium
9  10  Customer J  Premium
// ... additional plans ...
In this case, groupby('plan') groups the data based on the ‘plan’ column.
Each group then contains only the rows from the DataFrame that share the same plan value.
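For comparison, the same kind of split by plan can be sketched with only the standard library, which may be enough when the goal is simply one JSON array per plan and no further analysis is needed. This is an alternative to the Pandas approach, not what groupby does internally; it assumes a shortened sample list, and the name groups_by_plan is illustrative.

```python
import json
from collections import defaultdict

# Shortened sample data (an assumption for this sketch).
customers = [
    {"id": 1, "plan": "Premium"},
    {"id": 2, "plan": "Basic"},
    {"id": 3, "plan": "Standard"},
    {"id": 4, "plan": "Premium"},
    {"id": 5, "plan": "Basic"},
]

# Map each plan value to the list of records that share it.
groups_by_plan = defaultdict(list)
for customer in customers:
    groups_by_plan[customer["plan"]].append(customer)

for plan, group in groups_by_plan.items():
    # Each group is a plain list of dicts, so it serializes directly to a JSON array.
    print(plan, json.dumps(group))
```

Unlike the Pandas version, the groups here appear in the order each plan is first seen rather than sorted by plan name.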