Convert JSON to Excel using Python Pandas

In this tutorial, you’ll learn how to convert JSON to Excel format using Pandas.

You’ll learn techniques like flattening and normalizing to handle multi-level JSON data, dealing with arrays and heterogeneous data types, and exporting to multiple Excel sheets.

 

 

Basic JSON to Excel Conversion

In this example, we have a simple JSON file containing basic customer data.

We read this JSON file using Pandas read_json method and then export it to an Excel file using to_excel() function.

JSON File Content (data.json):

[
    {"customer_id": 1, "name": "Customer A", "plan": "Basic"},
    {"customer_id": 2, "name": "Customer B", "plan": "Premium"}
]

Code:

import pandas as pd
df = pd.read_json('data.json')
df.to_excel('output.xlsx', index=False)

 

Export Nested JSON

Here, the JSON file contains nested data, such as a list of phone numbers for each customer.

We’ll use json_normalize to flatten this data before exporting it to Excel.

JSON File Content (nested_data.json):

[
    {"customer_id": 1, "name": "Customer A", "numbers": ["123-456-7890", "234-567-8901"]},
    {"customer_id": 2, "name": "Customer B", "numbers": ["345-678-9012"]}
]

Code:

import pandas as pd
from pandas import json_normalize
import json
with open('nested_data.json') as file:
    data = json.load(file)
df = json_normalize(data, 'numbers', ['customer_id', 'name'])
df.to_excel('nested_output.xlsx', index=False)

 

JSON with Multiple Levels of Nesting

In this scenario, the JSON file has multiple levels of nested data, such as customer details and their respective call records.

We’ll carefully normalize this data to a flat structure for Excel.

JSON File Content (complex_data.json):

[
    {
        "customer_id": 1, 
        "name": "Customer A", 
        "calls": [
            {"date": "2023-01-01", "duration": 30},
            {"date": "2023-01-02", "duration": 45}
        ]
    },
    {
        "customer_id": 2, 
        "name": "Customer B", 
        "calls": [
            {"date": "2023-01-03", "duration": 60}
        ]
    }
]

Code:

import pandas as pd
from pandas import json_normalize
import json
with open('complex_data.json') as file:
    data = json.load(file)
df = json_normalize(data, 'calls', ['customer_id', 'name'])
df.to_excel('complex_output.xlsx', index=False)

 

JSON with Multiple Arrays

This example deals with a JSON file where each customer has multiple arrays for different services.

We’ll split these arrays into separate DataFrames and then combine them for the final Excel file.

JSON File Content (multiple_arrays.json):

[
    {
        "customer_id": 1, 
        "name": "Customer A",
        "internet": [{"plan": "Basic", "data_limit": 100}],
        "tv": [{"package": "Standard", "channels": 50}]
    },
    {
        "customer_id": 2, 
        "name": "Customer B",
        "internet": [{"plan": "Premium", "data_limit": 200}],
        "tv": [{"package": "Premium", "channels": 100}]
    }
]

Code:

import pandas as pd
import json
with open('multiple_arrays.json') as file:
    data = json.load(file)
internet_df = pd.json_normalize(data, 'internet', ['customer_id', 'name'])
tv_df = pd.json_normalize(data, 'tv', ['customer_id', 'name'])
merged_df = pd.merge(internet_df, tv_df, on=['customer_id', 'name'])
merged_df.to_excel('multiple_arrays_output.xlsx', index=False)

 

JSON with Mixed Data Types

This example features a JSON file with mixed data types, including strings, integers, and arrays.

JSON File Content (mixed_data_types.json):

[
    {
        "customer_id": 1, 
        "name": "Customer A",
        "age": 30,
        "services": ["Internet", "TV"]
    },
    {
        "customer_id": 2, 
        "name": "Customer B",
        "age": 45,
        "services": ["Internet"]
    }
]

Code:

import pandas as pd
df = pd.read_json('mixed_data_types.json')
df['services'] = df['services'].apply(lambda x: ', '.join(x))
df.to_excel('mixed_data_types_output.xlsx', index=False)

 

Export to Multiple Excel Sheets

In this example, we process a JSON file with different categories of data, each being exported as a separate sheet in an Excel file.

JSON File Content (categories.json):

{
    "customers": [
        {"customer_id": 1, "name": "Customer A"},
        {"customer_id": 2, "name": "Customer B"}
    ],
    "transactions": [
        {"transaction_id": 100, "amount": 50, "customer_id": 1},
        {"transaction_id": 101, "amount": 75, "customer_id": 2}
    ]
}

Code:

import pandas as pd
import json
with open('categories.json') as file:
    data = json.load(file)
customers_df = pd.DataFrame(data['customers'])
transactions_df = pd.DataFrame(data['transactions'])
with pd.ExcelWriter('categories_output.xlsx') as writer:
    customers_df.to_excel(writer, sheet_name='Customers', index=False)
    transactions_df.to_excel(writer, sheet_name='Transactions', index=False)

 

Grouping Before Export

Here, the JSON data contains date keys, and we perform grouping before exporting to Excel.

JSON File Content (date_keys.json):

[
    {"date": "2023-01-01", "usage": 1.5, "customer_id": 1},
    {"date": "2023-01-01", "usage": 2.0, "customer_id": 2},
    {"date": "2023-01-02", "usage": 1.7, "customer_id": 1}
]

Code:

import pandas as pd
df = pd.read_json('date_keys.json')
group_df = df.groupby('date')['usage'].sum().reset_index()
group_df.to_excel('date_keys_group_output.xlsx', index=False)
Leave a Reply

Your email address will not be published. Required fields are marked *