Convert CSV to JSON using Python Pandas (Easy Tutorial)

In this tutorial, We’ll learn how to convert CSV files to JSON, including reading CSV files, converting the entire DataFrame as well as specific columns or rows to JSON.

We’ll use Pandas read_csv() and Pandas to_json() to do this.

Table of Contents hide

1 Reading CSV File
2 Converting DataFrame to JSON
3 Converting Specific Columns of the CSV Data
4 Convert Specific Rows
5 Convert CSV to Nested JSON
6 Formatting and Indenting JSON

Reading CSV File

To start converting a CSV file to JSON, the first step is to read the CSV file using Pandas.

Pandas read_csv() function loads data from a CSV file into a DataFrame.

Here’s how you do it:

import pandas as pd
csv_file = 'telecom_data.csv'
df = pd.read_csv(csv_file)
print(df.head())

Output:

   User_ID  Call_Duration  Data_Usage_GB Plan_Type
0    10001            300           2.5     Basic
1    10002            450           4.0  Standard
2    10003            200           1.0     Basic
3    10004            350           3.0  Standard
4    10005            500           5.0   Premium

Here we read the CSV file into a DataFrame and display the first five rows using the head() method.

Converting DataFrame to JSON

Once the CSV file is loaded into a DataFrame, you can convert it to JSON using Pandas to_json() method.

You can also customize the format of the JSON output based on your requirements.

Here’s how to convert the DataFrame to JSON:

json_data = df.to_json(orient='records', lines=True)
print(json_data)

Output:

{"User_ID":10001,"Call_Duration":300,"Data_Usage_GB":2.5,"Plan_Type":"Basic"}
{"User_ID":10002,"Call_Duration":450,"Data_Usage_GB":4.0,"Plan_Type":"Standard"}
{"User_ID":10003,"Call_Duration":200,"Data_Usage_GB":1.0,"Plan_Type":"Basic"}
...

The orient='records' parameter structures the JSON data as a list of records, and lines=True formats the output with each record on a separate line.

Converting Specific Columns of the CSV Data

You can select specific columns and convert them to JSON by specifying the columns:

selected_columns = df[['User_ID', 'Plan_Type']]
selected_json = selected_columns.to_json(orient='records', lines=True)
print(selected_json)

Output:

{"User_ID":10001,"Plan_Type":"Basic"}
{"User_ID":10002,"Plan_Type":"Standard"}
{"User_ID":10003,"Plan_Type":"Basic"}
...

Convert Specific Rows

You can convert only specific rows from your CSV data to JSON by filtering the rows based on the conditions you want:

# Selecting specific rows based on a condition
specific_rows = df[df['Call_Duration'] > 400]
specific_rows_json = specific_rows.to_json(orient='records', lines=True)
print(specific_rows_json)

Output:

{"User_ID":10002,"Call_Duration":450,"Data_Usage_GB":4.0,"Plan_Type":"Standard"}
{"User_ID":10005,"Call_Duration":500,"Data_Usage_GB":5.0,"Plan_Type":"Premium"}
...

In this example, the DataFrame df is filtered to include only rows where the ‘Call_Duration’ is greater than 400 minutes.

The filtered DataFrame specific_rows is then converted to a JSON string.

Convert CSV to Nested JSON

Converting CSV data to a nested JSON format is useful when dealing with complex data structures or when you need to maintain hierarchical relationships in your data.

Nested JSON structures are common in various applications, especially those involving web APIs or data storage solutions. Pandas can be used to create these nested JSON structures from a CSV file.

Here’s how you can do it:

# Grouping data by 'Plan_Type' and converting each group to JSON
nested_json = df.groupby('Plan_Type').apply(lambda x: x.to_dict(orient='records')).to_json()
print(nested_json)

Output:

{
    "Basic": [{"User_ID":10001,"Call_Duration":300,"Data_Usage_GB":2.5},
              {"User_ID":10003,"Call_Duration":200,"Data_Usage_GB":1.0}],
    "Standard": [{"User_ID":10002,"Call_Duration":450,"Data_Usage_GB":4.0},
                 {"User_ID":10004,"Call_Duration":350,"Data_Usage_GB":3.0}],
    "Premium": [{"User_ID":10005,"Call_Duration":500,"Data_Usage_GB":5.0}]
}

In this example, the DataFrame df is grouped by the ‘Plan_Type’ column. For each group, the to_dict(orient='records') method converts the group into a list of dictionaries.

Finally, the whole structure is converted into a JSON string, resulting in a nested JSON format where each plan type is a key associated with a list of records.

Formatting and Indenting JSON

Python’s JSON module allows you to format and indent your JSON data.

Here’s how to format and indent JSON data:

import json
unformatted_json = df.to_json(orient='records')

# Load the JSON string into a Python object
json_object = json.loads(unformatted_json)

# Format and indent the JSON data
formatted_json = json.dumps(json_object, indent=4)
print(formatted_json)

Output:

[
    {
        "User_ID": 10001,
        "Call_Duration": 300,
        "Data_Usage_GB": 2.5,
        "Plan_Type": "Basic"
    },
    {
        "User_ID": 10002,
        "Call_Duration": 450,
        "Data_Usage_GB": 4.0,
        "Plan_Type": "Standard"
    },
    ...
]

In this code, the DataFrame df is first converted to a JSON string. This string is then loaded into a Python object using json.loads().

Finally, json.dumps() is used to convert the Python object back into a formatted JSON string, with an indentation of 4 spaces, making it more readable and structured.

Mokhtar Ebrahim

Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.

Reading CSV File

Converting DataFrame to JSON

Converting Specific Columns of the CSV Data

Convert Specific Rows

Convert CSV to Nested JSON

Formatting and Indenting JSON

Related posts

Leave a Reply Cancel reply