5 Methods to Split a JSON Array in Python

In this tutorial, you’ll learn various methods to split JSON arrays in Python.

You’ll learn about list slicing, condition-based splitting, using libraries like NumPy and Pandas, and more.

 

 

 

Using List Slicing

First, assume you have a JSON array like this:

import json
json_data = '''
[
    {"id": 1, "name": "Customer A", "plan": "Premium"},
    {"id": 2, "name": "Customer B", "plan": "Basic"},
    {"id": 3, "name": "Customer C", "plan": "Standard"},
    {"id": 4, "name": "Customer D", "plan": "Premium"},
    {"id": 5, "name": "Customer E", "plan": "Basic"},
    {"id": 6, "name": "Customer F", "plan": "Standard"},
    {"id": 7, "name": "Customer G", "plan": "Premium"},
    {"id": 8, "name": "Customer H", "plan": "Basic"},
    {"id": 9, "name": "Customer I", "plan": "Standard"},
    {"id": 10, "name": "Customer J", "plan": "Premium"}
]
'''
customers = json.loads(json_data)

Now, let’s say you want to split this array into two parts. For simplicity, let’s split it in the middle.

# Splitting the array
mid_index = len(customers) // 2
first_half = customers[:mid_index]
second_half = customers[mid_index:]
print("First half:", first_half)
print("Second half:", second_half)

Output:

First half: [{'id': 1, 'name': 'Customer A', 'plan': 'Premium'}, {'id': 2, 'name': 'Customer B', 'plan': 'Basic'}, {'id': 3, 'name': 'Customer C', 'plan': 'Standard'}, {'id': 4, 'name': 'Customer D', 'plan': 'Premium'}, {'id': 5, 'name': 'Customer E', 'plan': 'Basic'}]
Second half: [{'id': 6, 'name': 'Customer F', 'plan': 'Standard'}, {'id': 7, 'name': 'Customer G', 'plan': 'Premium'}, {'id': 8, 'name': 'Customer H', 'plan': 'Basic'}, {'id': 9, 'name': 'Customer I', 'plan': 'Standard'}, {'id': 10, 'name': 'Customer J', 'plan': 'Premium'}]

In this output, the original array has been divided into two halves.

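The mid_index variable marks the split point. Because it is computed with integer division, the two halves are the same size when the array has an even number of items; with an odd number of items, second_half receives the extra one.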
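
To see how slicing behaves when the array length is odd, and how the same idea extends to more than two parts, here is a minimal sketch. The helper names split_at and chunk_by_slicing are hypothetical (they are not part of the standard library), and the sample data is a shortened stand-in for the customer list.

```python
import json

# Shortened stand-in for the tutorial's customer data (assumption: five items)
json_data = '''
[
    {"id": 1, "plan": "Premium"},
    {"id": 2, "plan": "Basic"},
    {"id": 3, "plan": "Standard"},
    {"id": 4, "plan": "Premium"},
    {"id": 5, "plan": "Basic"}
]
'''
customers = json.loads(json_data)


def split_at(items, index):
    """Hypothetical helper: return the slices before and from index."""
    return items[:index], items[index:]


def chunk_by_slicing(items, size):
    """Hypothetical helper: cut items into consecutive slices of at most size."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# With five items, integer division puts the extra item in the second half
first_half, second_half = split_at(customers, len(customers) // 2)
print(len(first_half), len(second_half))  # 2 3

# The same slicing idea, repeated with a step, gives fixed-size chunks
chunks = chunk_by_slicing(customers, 2)
print([[c["id"] for c in chunk] for chunk in chunks])  # [[1, 2], [3, 4], [5]]
```

Slicing never raises an error for an out-of-range end index, which is why the last chunk can simply be shorter than the others.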

 

Using List Comprehensions

You can split a JSON array based on conditions using list comprehensions.

Suppose you want to separate customers into two groups based on their plan: ‘Premium’ and others.

premium_customers = [customer for customer in customers if customer['plan'] == 'Premium']
other_customers = [customer for customer in customers if customer['plan'] != 'Premium']
print("Premium Customers:", premium_customers)
print("Other Customers:", other_customers)

Output:

Premium Customers: [{'id': 1, 'name': 'Customer A', 'plan': 'Premium'}, {'id': 4, 'name': 'Customer D', 'plan': 'Premium'}, {'id': 7, 'name': 'Customer G', 'plan': 'Premium'}, {'id': 10, 'name': 'Customer J', 'plan': 'Premium'}]
Other Customers: [{'id': 2, 'name': 'Customer B', 'plan': 'Basic'}, {'id': 3, 'name': 'Customer C', 'plan': 'Standard'}, {'id': 5, 'name': 'Customer E', 'plan': 'Basic'}, {'id': 6, 'name': 'Customer F', 'plan': 'Standard'}, {'id': 8, 'name': 'Customer H', 'plan': 'Basic'}, {'id': 9, 'name': 'Customer I', 'plan': 'Standard'}]

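This creates two new lists, premium_customers and other_customers. Each comprehension makes its own pass over customers and keeps only the items that match its condition.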
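
If you need one list per plan rather than just two groups, a single loop can do the work of several comprehensions. The following is a minimal sketch of that alternative using collections.defaultdict; the name by_plan and the shortened sample data are assumptions for illustration.

```python
from collections import defaultdict

# Shortened stand-in for the tutorial's customer data (assumption)
customers = [
    {"id": 1, "plan": "Premium"},
    {"id": 2, "plan": "Basic"},
    {"id": 3, "plan": "Standard"},
    {"id": 4, "plan": "Premium"},
]

# One pass over the data: each customer lands in the list for its plan
by_plan = defaultdict(list)
for customer in customers:
    by_plan[customer["plan"]].append(customer)

# Show only the ids to keep the output short
ids_by_plan = {plan: [c["id"] for c in group] for plan, group in by_plan.items()}
print(ids_by_plan)  # {'Premium': [1, 4], 'Basic': [2], 'Standard': [3]}
```

The two-comprehension version stays easier to read when there are only two groups; the dictionary version avoids rescanning the list as the number of groups grows.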

 

Using numpy.array_split

NumPy’s array_split function is useful when you need to divide data into nearly equal parts, even when the number of items can’t be divided evenly.

First, ensure you have NumPy installed:

pip install numpy

Now, let’s apply numpy.array_split to our customer data. Assume the same JSON data as before, converted into a Python list.

import numpy as np
split_arrays = np.array_split(customers, 3)
for i, array in enumerate(split_arrays):
    print(f"Part {i+1}:", array.tolist())

Output:

Part 1: [{'id': 1, 'name': 'Customer A', 'plan': 'Premium'}, {'id': 2, 'name': 'Customer B', 'plan': 'Basic'}, {'id': 3, 'name': 'Customer C', 'plan': 'Standard'}, {'id': 4, 'name': 'Customer D', 'plan': 'Premium'}]
Part 2: [{'id': 5, 'name': 'Customer E', 'plan': 'Basic'}, {'id': 6, 'name': 'Customer F', 'plan': 'Standard'}, {'id': 7, 'name': 'Customer G', 'plan': 'Premium'}]
Part 3: [{'id': 8, 'name': 'Customer H', 'plan': 'Basic'}, {'id': 9, 'name': 'Customer I', 'plan': 'Standard'}, {'id': 10, 'name': 'Customer J', 'plan': 'Premium'}]

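In this example, numpy.array_split divides the list of customers into three parts containing 4, 3, and 3 items. Unlike numpy.split, which raises an error when the data can’t be divided equally, array_split handles uneven divisions by giving the earlier parts one extra item. It returns NumPy arrays, which is why the loop calls tolist() before printing.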
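
The sizing rule behind this result is documented by NumPy: for a length l split into n sections, the first l % n parts have l // n + 1 items and the rest have l // n. The sketch below reproduces that rule in plain Python so the arithmetic is visible; split_nearly_equal is a hypothetical helper written for illustration, not a NumPy function.

```python
def split_nearly_equal(items, parts):
    """Hypothetical helper: split items into `parts` slices whose sizes
    follow the rule numpy.array_split documents for a section count."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    base, extra = divmod(len(items), parts)
    result = []
    start = 0
    for i in range(parts):
        # The first `extra` parts each take one additional item
        size = base + (1 if i < extra else 0)
        result.append(items[start:start + size])
        start += size
    return result


customer_ids = list(range(1, 11))  # stand-in for the ten customer records
parts = split_nearly_equal(customer_ids, 3)
print([len(part) for part in parts])  # [4, 3, 3]
print(parts[0])  # [1, 2, 3, 4]
```

A plain-Python version like this keeps the results as ordinary lists, which can be convenient if the parts are going straight back into json.dumps.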

 

Using Iterative Splitting

Iterative splitting is useful when the division logic needs to be determined dynamically, for example while looping over the data.

Suppose you want to split the customer data into groups of a fixed size. Because the sample data cycles through the plan types, each full group of three ends up with a mix of plans. Here’s how you can do it:

def iterative_split(data, group_size):
    groups = []
    temp_group = []
    for item in data:
        temp_group.append(item)
        if len(temp_group) == group_size:
            groups.append(temp_group)
            temp_group = []
    if temp_group:
        groups.append(temp_group)
    return groups

# Split the customers into groups of three; the last group holds the remainder
grouped_customers = iterative_split(customers, 3)
for i, group in enumerate(grouped_customers):
    print(f"Group {i+1}:", group)

Output:

Group 1: [{'id': 1, 'name': 'Customer A', 'plan': 'Premium'}, {'id': 2, 'name': 'Customer B', 'plan': 'Basic'}, {'id': 3, 'name': 'Customer C', 'plan': 'Standard'}]
Group 2: [{'id': 4, 'name': 'Customer D', 'plan': 'Premium'}, {'id': 5, 'name': 'Customer E', 'plan': 'Basic'}, {'id': 6, 'name': 'Customer F', 'plan': 'Standard'}]
Group 3: [{'id': 7, 'name': 'Customer G', 'plan': 'Premium'}, {'id': 8, 'name': 'Customer H', 'plan': 'Basic'}, {'id': 9, 'name': 'Customer I', 'plan': 'Standard'}]
Group 4: [{'id': 10, 'name': 'Customer J', 'plan': 'Premium'}]

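In this example, iterative_split takes a list of data and a group size. It appends items to a temporary group, moves that group into the result once it reaches group_size, and finally adds any leftover items as a smaller last group. That is why Group 4 contains a single customer.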
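
The same loop can also be written as a generator, which yields one group at a time instead of building the full list of groups first. This is a minimal sketch of that variation using itertools.islice; iter_chunks is a hypothetical name, and plain integers stand in for the customer records.

```python
from itertools import islice


def iter_chunks(iterable, size):
    """Hypothetical helper: lazily yield lists of up to `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while True:
        # islice pulls at most `size` items from the shared iterator
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


customer_ids = range(1, 11)  # stand-in for the ten customer records
groups = list(iter_chunks(customer_ids, 3))
print(groups)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
```

Because it only needs an iterator, this version also works on data that doesn’t support len() or slicing, such as records read one by one from a file.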

 

Using Pandas GroupBy

The Pandas groupby method segments the data into groups based on some criteria and lets you apply a function to each group independently.

First, make sure Pandas is installed:

pip install pandas

Now, let’s use GroupBy to split our customer data based on their subscription plan.

import pandas as pd

# Converting list to Pandas DataFrame
customers_df = pd.DataFrame(customers)
grouped_customers = customers_df.groupby('plan')
for plan, group in grouped_customers:
    print(f"Plan: {plan}")
    print(group)

Output:

Plan: Basic
   id        name   plan
1   2  Customer B  Basic
4   5  Customer E  Basic
7   8  Customer H  Basic

Plan: Premium
    id        name     plan
0    1  Customer A  Premium
3    4  Customer D  Premium
6    7  Customer G  Premium
9   10  Customer J  Premium

// ... additional plans ...

In this case, groupby('plan') groups the data based on the ‘plan’ column.

Each group then contains only the rows from the DataFrame that share the same plan value.
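
Since the starting point was JSON, you may want each group back as a JSON array rather than as a DataFrame. The following is a minimal sketch of one way to do that with DataFrame.to_json(orient="records"); the name json_by_plan and the shortened sample data are assumptions for illustration.

```python
import json

import pandas as pd

# Shortened stand-in for the tutorial's customer data (assumption)
customers = [
    {"id": 1, "name": "Customer A", "plan": "Premium"},
    {"id": 2, "name": "Customer B", "plan": "Basic"},
    {"id": 3, "name": "Customer C", "plan": "Premium"},
]
customers_df = pd.DataFrame(customers)

# Serialize each group to its own JSON array string, keyed by plan
json_by_plan = {
    plan: group.to_json(orient="records")
    for plan, group in customers_df.groupby("plan")
}

# Parse one group back to confirm it is a regular JSON array of objects
premium = json.loads(json_by_plan["Premium"])
print([customer["id"] for customer in premium])  # [1, 3]
```

Each value in json_by_plan is a string, so it can be written to a file or sent over the network as-is.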
