Add Multiple Rows to Pandas DataFrame (Different Methods)

Adding a row to DataFrame is one of the common tasks while working with Pandas, but sometimes we want to add multiple rows to a DataFrame.

This tutorial will guide you through the process of doing just that using different methods.

Table of Contents hide

1 Using loc[]
2 Using concat()
- 2.1 Basic Concatenation Along the Row Axis
- 2.2 Handling Columns That Don’t Match Up
3 Adding Multiple Rows in a Specified Position (Between Rows)
4 Performance Comparison

Using loc[]

By assigning values to a new index using loc[], we can add rows to the bottom of our DataFrame.

Let’s first create a sample Dataframe:

import pandas as pd
data = {
    'Customer_ID': [1, 2, 3],
    'Monthly_Bill': [45.0, 55.0, 65.0],
    'Plan': ['Basic', 'Standard', 'Premium']
}
df = pd.DataFrame(data)
print(df)

Output:

   Customer_ID  Monthly_Bill      Plan
0            1          45.0     Basic
1            2          55.0  Standard
2            3          65.0   Premium

Next, let’s add more customer data using the loc[] method.

df.loc[3] = [4, 75.0, 'Deluxe']
df.loc[4] = [5, 85.0, 'Ultimate']
print(df)

Output:

   Customer_ID  Monthly_Bill      Plan
0            1          45.0     Basic
1            2          55.0  Standard
2            3          65.0   Premium
3            4          75.0    Deluxe
4            5          85.0  Ultimate

Using concat()

The pd.concat() function provides yet another robust way to add multiple rows to a DataFrame.

It is useful when you have to concatenate multiple DataFrames along a particular axis, handle unmatched columns, or even create multi-indexed DataFrames.

Basic Concatenation Along the Row Axis

To concatenate along the row axis, set the axis parameter to 0. This is also the default behavior.

data2 = {
    'Customer_ID': [10, 11],
    'Monthly_Bill': [135.0, 145.0],
    'Plan': ['National', 'Continental']
}
df2 = pd.DataFrame(data2)

# Concatenate along the row axis
result_df = pd.concat([df, df2], axis=0, ignore_index=True)
print(result_df)

Output:

   Customer_ID  Monthly_Bill         Plan
0            1          45.0        Basic
1            2          55.0     Standard
2            3          65.0      Premium
3            4          75.0       Deluxe
4            5          85.0     Ultimate
5           10         135.0     National
6           11         145.0  Continental

Handling Columns That Don’t Match Up

If the DataFrames have different columns, pd.concat() can still handle it gracefully.

# DataFrame with an extra column
data3 = {
    'Customer_ID': [12],
    'Monthly_Bill': [155.0],
    'Plan': ['Global Plus'],
    'Duration': [12]
}
df3 = pd.DataFrame(data3)

# Concatenating
result_df = pd.concat([df, df3], axis=0, ignore_index=True, sort=False)
print(result_df)

Output:

   Customer_ID  Monthly_Bill         Plan  Duration
0            1          45.0        Basic       NaN
1            2          55.0     Standard       NaN
2            3          65.0      Premium       NaN
3            4          75.0       Deluxe       NaN
4            5          85.0     Ultimate       NaN
5           12         155.0  Global Plus      12.0

Notice that for the rows from the original DataFrame where the new column “Duration” does not exist, Pandas fills it with NaN.

Adding Multiple Rows in a Specified Position (Between Rows)

You can insert rows at a specific position by slicing and concatenating DataFrames.

First, slice the DataFrame into two parts: one before the position where you want to add new rows, and one after.

# Original DataFrame
print("Original DataFrame:")
print(df)

# Slice DataFrame
df1 = df.iloc[:3]
df2 = df.iloc[3:]

Output:

Original DataFrame:
   Customer_ID  Monthly_Bill         Plan
0            1          45.0        Basic
1            2          55.0     Standard
2            3          65.0      Premium
3            4          75.0       Deluxe
4            5          85.0     Ultimate

Here, df1 contains the first three rows, and df2 contains the rest.

Now, create a DataFrame containing the rows you want to insert.

# DataFrame to insert
data_to_insert = {
    'Customer_ID': [15, 16],
    'Monthly_Bill': [100.0, 110.0],
    'Plan': ['Special', 'Special+']
}
df_to_insert = pd.DataFrame(data_to_insert)
print("DataFrame to Insert:")
print(df_to_insert)

Output:

DataFrame to Insert:
   Customer_ID  Monthly_Bill      Plan
0           15         100.0   Special
1           16         110.0  Special+

Finally, concatenate df1, the DataFrame to insert, and df2 in that order.

# Concatenate DataFrames
df = pd.concat([df1, df_to_insert, df2], ignore_index=True)
print("DataFrame After Insertion:")
print(df)

Output:

DataFrame After Insertion:
   Customer_ID  Monthly_Bill       Plan
0            1          45.0      Basic
1            2          55.0   Standard
2            3          65.0    Premium
3           15         100.0    Special
4           16         110.0   Special+
5            4          75.0     Deluxe
6            5          85.0   Ultimate

Performance Comparison

For performance comparison, we can use Python’s built-in time module to measure the elapsed time for each operation.

import pandas as pd
import numpy as np
import time
df = pd.DataFrame({'A': np.arange(1, 1000001), 'B': np.arange(1000001, 2000001)})
df_to_add = pd.DataFrame({'A': np.arange(2000001, 3000001), 'B': np.arange(3000001, 4000001)})

# Benchmark loc
start_time = time.time()
for _ in range(100):
    df.loc[len(df):len(df)+len(df_to_add)] = df_to_add
elapsed_time_loc = time.time() - start_time
print(f"Elapsed time for loc: {elapsed_time_loc:.4f} seconds")

# Benchmark concat
start_time = time.time()
for _ in range(100):
    pd.concat([df, df_to_add], ignore_index=True)
elapsed_time_concat = time.time() - start_time
print(f"Elapsed time for concat: {elapsed_time_concat:.4f} seconds")

Output:

Elapsed time for loc: 0.0394 seconds
Elapsed time for concat: 0.7206 seconds

In this example, you’ll notice that loc[] is more time-efficient than concat() for adding a large number of rows.

Mokhtar Ebrahim

Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.

Using loc[]

Using concat()

Basic Concatenation Along the Row Axis

Handling Columns That Don’t Match Up

Adding Multiple Rows in a Specified Position (Between Rows)

Performance Comparison

Related posts

Leave a Reply Cancel reply