Add Row with Average in Pandas DataFrame

In this tutorial, we’ll explore step-by-step methods to add a row with average values in a Pandas DataFrame.

You will use the Pandas methods mean() and groupby(), together with loc[] and concat(), to do this.

 

Calculating the Average using mean()

Let’s start with the simplest approach: calling mean() on the DataFrame.

import pandas as pd
data = {'CustomerID': [1, 2, 3],
        'MonthlyCharges': [70.20, 45.30, 89.90],
        'TotalCharges': [492.15, 1450.60, 789.25]}
df = pd.DataFrame(data)

# Calculate the average using mean()
average_values = df.mean()
print(average_values)

Output:

CustomerID          2.000000
MonthlyCharges     68.466667
TotalCharges      910.666667
dtype: float64

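mean() is a DataFrame method that averages each column and returns a Pandas Series indexed by column name. All three columns here are numeric, so CustomerID is averaged as well, even though the mean of an ID has no real meaning. If the DataFrame also held non-numeric columns, you would restrict the calculation to the numeric ones, for example with select_dtypes() as in the next section.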
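
If the identifier column should not be averaged, one option is to select only the charge columns before calling mean(). This is a minimal sketch using the same sample data; the name charge_means is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'CustomerID': [1, 2, 3],
    'MonthlyCharges': [70.20, 45.30, 89.90],
    'TotalCharges': [492.15, 1450.60, 789.25],
})

# Average only the charge columns and leave the ID column out
charge_means = df[['MonthlyCharges', 'TotalCharges']].mean()

# Equivalent here: drop the ID column, then average what is left
same_means = df.drop(columns='CustomerID').mean(numeric_only=True)

print(charge_means)
```

Both variants return a Series with two entries, one per charge column.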

 

Add Average Row Using loc[]

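loc[] is a label-based indexer rather than a function: assigning to a label that does not exist yet appends a new row to the DataFrame in place. Because the averages are floats, the integer CustomerID column is converted to float, as the output below shows.
First, calculate the average of numerical columns: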

average_values = df.select_dtypes(include=['number']).mean()

Now add this row:

df.loc['Average'] = average_values
print(df)

Output:

         CustomerID  MonthlyCharges  TotalCharges
0               1.0       70.200000    492.150000
1               2.0       45.300000   1450.600000
2               3.0       89.900000    789.250000
Average         2.0       68.466667    910.666667
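
Since assigning through loc[] changes df itself, it can help to work on a copy when the original data is still needed. The sketch below assumes the same sample data and also leaves CustomerID out of the average, so that column is filled with NaN in the new row; with_avg is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    'CustomerID': [1, 2, 3],
    'MonthlyCharges': [70.20, 45.30, 89.90],
    'TotalCharges': [492.15, 1450.60, 789.25],
})

# Work on a copy so the original DataFrame keeps its three rows
with_avg = df.copy()

# Average only the charge columns; the Series has no CustomerID entry,
# so that cell is filled with NaN when the row is added
with_avg.loc['Average'] = df[['MonthlyCharges', 'TotalCharges']].mean()

print(with_avg)
print(len(df))  # the original is unchanged
```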

 

Add Average Row Using concat()

Alternatively, you can use the concat() function, which returns a new DataFrame with the average row added instead of modifying the original one.

First, let’s create a DataFrame containing the average values, which will then be concatenated to the original DataFrame.

# Calculate the average values for the DataFrame
average_values = df.select_dtypes(include=['number']).mean()

# Convert the Pandas Series to a DataFrame
average_df = pd.DataFrame([average_values])
average_df['CustomerID'] = 'Average'
print(average_df)

Output:

  CustomerID  MonthlyCharges  TotalCharges
0    Average       68.466667    910.666667

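Here, we’ve transformed the average values into a single-row DataFrame and replaced its CustomerID value with the label 'Average'.

Now that you have a DataFrame with the average values, you can concatenate it with the original DataFrame. Note that df still contains the 'Average' row added with loc[] in the previous section, so the result has two average rows: row 3 comes from the loc[] step and row 4 from concat().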

concatenated_df = pd.concat([df, average_df], ignore_index=True)
print(concatenated_df)

Output:

  CustomerID  MonthlyCharges  TotalCharges
0        1.0       70.200000    492.150000
1        2.0       45.300000   1450.600000
2        3.0       89.900000    789.250000
3        2.0       68.466667    910.666667
4    Average       68.466667    910.666667
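
To end up with a single average row, the DataFrame can be rebuilt from data before concatenating, so that the row added with loc[] earlier is not carried along. A minimal sketch, assuming the same sample data; result is an illustrative name.

```python
import pandas as pd

data = {'CustomerID': [1, 2, 3],
        'MonthlyCharges': [70.20, 45.30, 89.90],
        'TotalCharges': [492.15, 1450.60, 789.25]}

# Start from a fresh DataFrame with only the three customer rows
df = pd.DataFrame(data)

# Build the one-row DataFrame of averages and label it
average_values = df.select_dtypes(include=['number']).mean()
average_df = pd.DataFrame([average_values])
average_df['CustomerID'] = 'Average'

# concat() returns a new DataFrame; df itself is not modified
result = pd.concat([df, average_df], ignore_index=True)
print(result)
```

The CustomerID column of result holds both numbers and the string 'Average', so it is stored with the generic object dtype.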

 

Adding Multiple Average Rows Based on Grouping

Pandas provides the groupby() method to split your data into groups. You can compute the average of each group and, if needed, append those averages back to the original DataFrame as extra rows.

First, let’s group the data based on CustomerID.

grouped_df = df.groupby('CustomerID')
print(grouped_df.size())

Output:

CustomerID
1.0    1
2.0    2
3.0    1
dtype: int64

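groupby() creates one group per unique CustomerID value. The keys appear as floats, and group 2.0 contains two rows, because df still includes the 'Average' row added with loc[] earlier, whose CustomerID is 2.0.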

Now, you can calculate the average for each group:

# Calculate the mean for each group
group_average = grouped_df.mean()
print(group_average)

Output:

            MonthlyCharges  TotalCharges
CustomerID                              
1.0              70.200000    492.150000
2.0              56.883333   1180.633333
3.0              89.900000    789.250000

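Here, the mean values for MonthlyCharges and TotalCharges are calculated for each CustomerID group. Groups 1.0 and 3.0 hold a single row each, so their means equal the original values, while the values for group 2.0 blend customer 2’s row with the leftover 'Average' row added with loc[] earlier.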
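
The group averages can be appended back to the DataFrame with the same concat() pattern. The sketch below rests on assumptions that are not part of the original data: the Plan column and the fourth customer are hypothetical, added only so that each group has more than one row.

```python
import pandas as pd

df = pd.DataFrame({
    'Plan': ['Basic', 'Basic', 'Premium', 'Premium'],  # hypothetical grouping column
    'MonthlyCharges': [70.20, 45.30, 89.90, 99.50],    # last value is hypothetical
    'TotalCharges': [492.15, 1450.60, 789.25, 2100.00],
})

# One row of averages per plan; as_index=False keeps Plan as a regular column
group_average = df.groupby('Plan', as_index=False).mean(numeric_only=True)

# Mark these rows so they can be told apart from the customer rows
group_average['Plan'] = group_average['Plan'] + ' Average'

# Append the average rows below the original rows
result = pd.concat([df, group_average], ignore_index=True)
print(result)
```

Labelling the rows in the grouping column keeps the averages easy to filter out later, for example before computing further statistics.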
