Add Rows to Pandas DataFrame in Loop
In this tutorial, you’ll learn different methods to add rows to Pandas DataFrame using loops.
We’ll use methods such as concat(), loc[], iloc[], iterrows(), and from_records().
Using concat
Let’s start with a sample DataFrame and assume we have multiple batches of new customers to add:
    import pandas as pd

    data = {'CustomerID': [1, 2, 3],
            'Name': ['John', 'Emily', 'Michael'],
            'Plan': ['Basic', 'Premium', 'Standard'],
            'Balance': [50, 120, 80]}
    df = pd.DataFrame(data)

    # Two batches of new customers, each with the same columns as df
    batch_1 = pd.DataFrame({'CustomerID': [4, 5],
                            'Name': ['Sarah', 'Alex'],
                            'Plan': ['Basic', 'Premium'],
                            'Balance': [60, 100]})
    batch_2 = pd.DataFrame({'CustomerID': [6, 7],
                            'Name': ['Daniel', 'Emma'],
                            'Plan': ['Standard', 'Basic'],
                            'Balance': [70, 55]})
    batches = [batch_1, batch_2]

    print("Initial DataFrame:")
    print(df)
Output:
    Initial DataFrame:
       CustomerID     Name      Plan  Balance
    0           1     John     Basic       50
    1           2    Emily   Premium      120
    2           3  Michael  Standard       80
Now, you can use a loop to add these batches to the existing DataFrame using concat:
    for batch in batches:
        # ignore_index=True renumbers the rows 0..n-1 after each append
        df = pd.concat([df, batch], ignore_index=True)

    print("DataFrame after adding batches:")
    print(df)
Output:
    DataFrame after adding batches:
       CustomerID     Name      Plan  Balance
    0           1     John     Basic       50
    1           2    Emily   Premium      120
    2           3  Michael  Standard       80
    3           4    Sarah     Basic       60
    4           5     Alex   Premium      100
    5           6   Daniel  Standard       70
    6           7     Emma     Basic       55
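Each pd.concat call builds a new DataFrame, so calling it once per batch repeats that copying on every iteration. When all the batches are already available in a list, as they are here, a single call should give the same result. A minimal sketch, reusing the sample data from this section (the name combined is only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'CustomerID': [1, 2, 3],
                   'Name': ['John', 'Emily', 'Michael'],
                   'Plan': ['Basic', 'Premium', 'Standard'],
                   'Balance': [50, 120, 80]})
batch_1 = pd.DataFrame({'CustomerID': [4, 5],
                        'Name': ['Sarah', 'Alex'],
                        'Plan': ['Basic', 'Premium'],
                        'Balance': [60, 100]})
batch_2 = pd.DataFrame({'CustomerID': [6, 7],
                        'Name': ['Daniel', 'Emma'],
                        'Plan': ['Standard', 'Basic'],
                        'Balance': [70, 55]})
batches = [batch_1, batch_2]

# One concat call over the original frame plus every batch,
# instead of one call per loop iteration
combined = pd.concat([df, *batches], ignore_index=True)

print(combined)
```

The loop version is still the natural choice when batches arrive one at a time and the DataFrame has to be up to date after each one.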
Adding Rows using loc and iloc in a Loop
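Both indexers can write rows inside a loop, but they behave differently: loc can append a row by assigning to an index label that does not exist yet, while iloc is purely positional and can only write to rows that already exist.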
Using loc
Let’s start with the initial DataFrame and a list of new customers:
    data = {'CustomerID': [1, 2, 3],
            'Name': ['John', 'Emily', 'Michael'],
            'Plan': ['Basic', 'Premium', 'Standard'],
            'Balance': [50, 120, 80]}
    df = pd.DataFrame(data)

    # List of new customers as dictionaries
    new_customers = [
        {'CustomerID': 4, 'Name': 'Sarah', 'Plan': 'Basic', 'Balance': 60},
        {'CustomerID': 5, 'Name': 'Alex', 'Plan': 'Premium', 'Balance': 100}
    ]

    print("Initial DataFrame:")
    print(df)
Output:
    Initial DataFrame:
       CustomerID     Name      Plan  Balance
    0           1     John     Basic       50
    1           2    Emily   Premium      120
    2           3  Michael  Standard       80
Now you can use the loc property within a for loop to add each new customer to the DataFrame:
    # Start numbering at len(df) so each new label follows the last existing row
    for idx, customer in enumerate(new_customers, start=len(df)):
        df.loc[idx] = [customer['CustomerID'], customer['Name'], customer['Plan'], customer['Balance']]

    print("DataFrame after adding rows using loc:")
    print(df)
Output:
    DataFrame after adding rows using loc:
       CustomerID     Name      Plan  Balance
    0           1     John     Basic       50
    1           2    Emily   Premium      120
    2           3  Michael  Standard       80
    3           4    Sarah     Basic       60
    4           5     Alex   Premium      100
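Assigning a plain list through loc is positional, so the values must be in the same order as the DataFrame’s columns. If the incoming dictionaries might list their keys in a different order, one option is to build the list from df.columns. This is a small sketch with hypothetical, deliberately shuffled input, and it assumes the default 0..n-1 index so that len(df) is always the next free label:

```python
import pandas as pd

df = pd.DataFrame({'CustomerID': [1, 2, 3],
                   'Name': ['John', 'Emily', 'Michael'],
                   'Plan': ['Basic', 'Premium', 'Standard'],
                   'Balance': [50, 120, 80]})

# Hypothetical input: keys are in a different order than the DataFrame columns
new_customers = [
    {'Name': 'Sarah', 'Balance': 60, 'CustomerID': 4, 'Plan': 'Basic'},
    {'Plan': 'Premium', 'CustomerID': 5, 'Balance': 100, 'Name': 'Alex'},
]

for customer in new_customers:
    # Order the values by df.columns so each one lands in the right column
    df.loc[len(df)] = [customer[col] for col in df.columns]

print(df)
```

A missing key raises a KeyError here, which is usually preferable to silently writing a value into the wrong column.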
Using iloc
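Because iloc is purely positional, it cannot create rows on its own. To add new rows with it, you first need to enlarge the DataFrame’s index (here with reindex, starting again from the initial three-row DataFrame). Then you can use iloc to place data directly into the new, empty row positions. Note that reindex fills the new rows with NaN, so the integer columns are upcast to float, which is why the output shows values such as 1.0 and 50.0: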
    # Number of new rows to add
    num_new_rows = 3

    # Increase DataFrame index size
    df_length = len(df)
    df = df.reindex(df.index.tolist() + list(range(df_length, df_length + num_new_rows)))

    for i in range(num_new_rows):
        new_row_index = df_length + i
        df.iloc[new_row_index] = [new_row_index + 1, f'Customer{new_row_index + 1}', 'Basic', 50 + new_row_index]

    print("DataFrame after adding rows using iloc in a loop:")
    print(df)
Output:
    DataFrame after adding rows using iloc in a loop:
       CustomerID       Name      Plan  Balance
    0         1.0       John     Basic     50.0
    1         2.0      Emily   Premium    120.0
    2         3.0    Michael  Standard     80.0
    3         4.0  Customer4     Basic     53.0
    4         5.0  Customer5     Basic     54.0
    5         6.0  Customer6     Basic     55.0
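Two details of the iloc approach are easy to miss: writing past the end of the DataFrame fails rather than appending, and the reindex step turns the integer columns into floats. The sketch below illustrates both on the three-row sample data and then casts the numeric columns back with astype. The variable enlarged is only there for illustration, and the cast assumes no NaN values remain in those columns:

```python
import pandas as pd

df = pd.DataFrame({'CustomerID': [1, 2, 3],
                   'Name': ['John', 'Emily', 'Michael'],
                   'Plan': ['Basic', 'Premium', 'Standard'],
                   'Balance': [50, 120, 80]})

# Writing to a position that does not exist raises IndexError
try:
    df.iloc[len(df)] = [4, 'Customer4', 'Basic', 53]
    enlarged = True
except IndexError:
    enlarged = False
print("iloc enlarged the DataFrame:", enlarged)

# Create the empty row first, then fill it by position
df = df.reindex(range(len(df) + 1))
df.iloc[3] = [4, 'Customer4', 'Basic', 53]

# reindex introduced NaN, which upcast the integer columns to float;
# once every row is filled, cast them back
df = df.astype({'CustomerID': 'int64', 'Balance': 'int64'})

print(df.dtypes)
```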
Using iterrows()
The iterrows() function offers another way to loop over the rows of a DataFrame and add new rows based on them.
It returns an iterator that yields (index, row) pairs, where each row is a Series.
This method is useful when you need the index while building the new rows.
Our initial DataFrame:
    data = {'CustomerID': [1, 2, 3],
            'Name': ['John', 'Emily', 'Michael'],
            'Plan': ['Basic', 'Premium', 'Standard'],
            'Balance': [50, 120, 80]}
    df = pd.DataFrame(data)

    print("Initial DataFrame:")
    print(df)
Output:
    Initial DataFrame:
       CustomerID     Name      Plan  Balance
    0           1     John     Basic       50
    1           2    Emily   Premium      120
    2           3  Michael  Standard       80
Let’s create new rows where each new row’s balance is the corresponding original row’s balance minus a service charge of 5.
Here’s how you can use iterrows() to do this:
    # Iterate over a snapshot so the loop never depends on rows added inside it
    for index, row in df.copy().iterrows():
        new_row = row.copy()
        new_row['Balance'] = row['Balance'] - 5  # Apply a service charge
        df.loc[len(df)] = new_row

    df.reset_index(drop=True, inplace=True)
    print(df)
Output:
       CustomerID     Name      Plan  Balance
    0           1     John     Basic       50
    1           2    Emily   Premium      120
    2           3  Michael  Standard       80
    3           1     John     Basic       45
    4           2    Emily   Premium      115
    5           3  Michael  Standard       75
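When every new row is derived from an existing row by the same rule, as with the flat service charge here, the loop can be avoided altogether. The following sketch is a swapped-in alternative rather than an iterrows() technique: it uses assign to build the adjusted copy in one step and concat to stack it under the original. The name charged is only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'CustomerID': [1, 2, 3],
                   'Name': ['John', 'Emily', 'Michael'],
                   'Plan': ['Basic', 'Premium', 'Standard'],
                   'Balance': [50, 120, 80]})

# Copy of every row with the service charge of 5 applied to Balance
charged = df.assign(Balance=df['Balance'] - 5)

# Stack the adjusted rows under the originals and renumber the index
df = pd.concat([df, charged], ignore_index=True)

print(df)
```

iterrows() remains a reasonable fit when each row needs its own branching logic that is awkward to express column-wise.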
Using DataFrame.from_records
You can use the DataFrame.from_records method to turn rows created in a loop into a DataFrame, and then add that DataFrame to an existing one.
Here, we’ll build a list of dictionaries in a loop and then convert it into a DataFrame.
Let’s start with the initial DataFrame:
    data = {'CustomerID': [1, 2, 3],
            'Name': ['John', 'Emily', 'Michael'],
            'Plan': ['Basic', 'Premium', 'Standard'],
            'Balance': [50, 120, 80]}
    df = pd.DataFrame(data)

    print("Initial DataFrame:")
    print(df)
Output:
    Initial DataFrame:
       CustomerID     Name      Plan  Balance
    0           1     John     Basic       50
    1           2    Emily   Premium      120
    2           3  Michael  Standard       80
Now, let’s suppose you want to add new customer rows dynamically, perhaps based on some condition or external data source. For demonstration, we’ll add 3 new rows in a for loop:
    new_rows_list = []

    # Loop to create new rows
    for i in range(4, 7):
        new_row = {'CustomerID': i, 'Name': f'Customer{i}', 'Plan': 'Basic', 'Balance': 60 + i}
        new_rows_list.append(new_row)

    # Convert the collected dictionaries into a DataFrame in one step
    new_rows_df = pd.DataFrame.from_records(new_rows_list)

    print("New rows as DataFrame:")
    print(new_rows_df)
Output:
    New rows as DataFrame:
       CustomerID       Name   Plan  Balance
    0           4  Customer4  Basic       64
    1           5  Customer5  Basic       65
    2           6  Customer6  Basic       66
Finally, you can concatenate this new DataFrame with the original one:
    # Merge the new rows DataFrame with the original DataFrame
    df = pd.concat([df, new_rows_df], ignore_index=True)

    print("DataFrame after efficient append using DataFrame.from_records and a for loop:")
    print(df)
Output:
    DataFrame after efficient append using DataFrame.from_records and a for loop:
       CustomerID       Name      Plan  Balance
    0           1       John     Basic       50
    1           2      Emily   Premium      120
    2           3    Michael  Standard       80
    3           4  Customer4     Basic       64
    4           5  Customer5     Basic       65
    5           6  Customer6     Basic       66
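from_records also accepts a list of tuples, in which case the column names are supplied through its columns parameter. This can be handy when the rows come from a source that has no keys, such as a database cursor. A minimal sketch under that assumption, with hypothetical tuple data matching the example above:

```python
import pandas as pd

df = pd.DataFrame({'CustomerID': [1, 2, 3],
                   'Name': ['John', 'Emily', 'Michael'],
                   'Plan': ['Basic', 'Premium', 'Standard'],
                   'Balance': [50, 120, 80]})

# Rows as plain tuples, in the same order as the DataFrame columns
new_rows = [(i, f'Customer{i}', 'Basic', 60 + i) for i in range(4, 7)]

# Name the columns explicitly, since tuples carry no keys
new_rows_df = pd.DataFrame.from_records(new_rows, columns=list(df.columns))

df = pd.concat([df, new_rows_df], ignore_index=True)
print(df)
```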
Performance Comparison
Let’s start by creating a sample DataFrame with 10,000 rows. We’ll time each method to append an additional 1,000 rows.
    import pandas as pd
    import time

    data = {'CustomerID': list(range(1, 10001)),
            'Name': [f'Customer{i}' for i in range(1, 10001)],
            'Plan': ['Basic'] * 10000,
            'Balance': [50] * 10000}
    df = pd.DataFrame(data)
Timing concat() Method
    new_rows = pd.DataFrame({'CustomerID': list(range(11001, 12001)),
                             'Name': [f'Customer{i}' for i in range(11001, 12001)],
                             'Plan': ['Basic'] * 1000,
                             'Balance': [50] * 1000})

    start_time = time.time()
    df = pd.concat([df, new_rows], ignore_index=True)
    end_time = time.time()
    print(f"Time taken using concat(): {end_time - start_time} seconds")
Timing loc with For Loop
    start_time = time.time()
    for i in range(12001, 13001):
        # Each assignment enlarges the DataFrame by one row
        df.loc[len(df.index)] = [i, f'Customer{i}', 'Basic', 50]
    end_time = time.time()
    print(f"Time taken using loc with for loop: {end_time - start_time} seconds")
Timing DataFrame.from_records with For Loop
    new_rows_list = []
    for i in range(13001, 14001):
        new_row = {'CustomerID': i, 'Name': f'Customer{i}', 'Plan': 'Basic', 'Balance': 50}
        new_rows_list.append(new_row)
    new_rows_df = pd.DataFrame.from_records(new_rows_list)

    # Only the final concat is timed, not the loop that builds the list
    start_time = time.time()
    df = pd.concat([df, new_rows_df], ignore_index=True)
    end_time = time.time()
    print(f"Time taken using DataFrame.from_records with for loop: {end_time - start_time} seconds")
Output:
    Time taken using concat(): 0.0020020008087158203 seconds
    Time taken using loc with for loop: 1.9779589176177979 seconds
    Time taken using DataFrame.from_records with for loop: 0.002157926559448242 seconds
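As you can see, a single concat() call and the DataFrame.from_records() approach each take about 2 milliseconds in this run, while appending the same number of rows one at a time with loc takes almost 2 seconds. Keep in mind that the from_records timing covers only the final concat, not the loop that builds the list of rows. In general, collecting rows first and adding them in one step is much faster than enlarging the DataFrame on every iteration.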
Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.