Add Rows to Pandas DataFrame Based on Conditions
In this tutorial, you’ll learn how to add rows to a Pandas DataFrame based on specific conditions.
We’ll cover various scenarios, including adding rows based on simple criteria, multiple conditions with logical operators, and incorporating string matching and DateTime conditions.
Adding a Row Based on Specific Criteria
First, let’s create a sample DataFrame to work with.
import pandas as pd df = pd.DataFrame({ 'ID': [1, 2, 3, 4], 'Plan': ['Basic', 'Premium', 'Basic', 'Standard'], 'Monthly_Charge': [20, 50, 20, 30] }) print(df)
Output:
ID Plan Monthly_Charge 0 1 Basic 20 1 2 Premium 50 2 3 Basic 20 3 4 Standard 30
This output shows the DataFrame df
containing information on subscriber IDs, their plans, and monthly charges.
Now, let’s say you want to add a row where the Plan
is “Premium.”
new_row = {'ID': 5, 'Plan': 'Premium', 'Monthly_Charge': 50} # Check if 'Premium' already exists in the 'Plan' column if 'Premium' in df['Plan'].values: df.loc[len(df)] = new_row print(df)
Output:
ID Plan Monthly_Charge 0 1 Basic 20 1 2 Premium 50 2 3 Basic 20 3 4 Standard 30 4 5 Premium 50
Multiple Conditions for Row Addition
Let’s see how you can add a row based on multiple conditions, using logical operators like and
, or
, and not
.
To begin, we’ll work with the same sample DataFrame from the previous example.
Using and Operator
Let’s say you want to add a new row to your DataFrame only if it already contains at least one ‘Premium’ plan with a monthly charge of 50.
new_row = {'ID': 6, 'Plan': 'Premium', 'Monthly_Charge': 50} # Add the new row based on multiple conditions if ('Premium' in df['Plan'].values) and (50 in df['Monthly_Charge'].values): df.loc[len(df)] = new_row print(df)
Output:
ID Plan Monthly_Charge 0 1 Basic 20 1 2 Premium 50 2 3 Basic 20 3 4 Standard 30 4 5 Premium 50 5 6 Premium 50
Using or Operator
You can use the or
operator if you want to add a row when either of the conditions is met. Let’s try to add a row if the DataFrame contains either a ‘Premium’ plan or a monthly charge of 20.
new_row = {'ID': 7, 'Plan': 'Basic', 'Monthly_Charge': 25} # Add the new row based on multiple conditions if ('Premium' in df['Plan'].values) or (20 in df['Monthly_Charge'].values): df.loc[len(df)] = new_row print(df)
Output:
ID Plan Monthly_Charge 0 1 Basic 20 1 2 Premium 50 2 3 Basic 20 3 4 Standard 30 4 5 Premium 50 5 6 Premium 50 6 7 Basic 25
Nested Conditional Statements to Add Rows
Imagine a situation where you want to add a new row with a ‘Premium’ plan only if the DataFrame already contains a ‘Premium’ plan with a monthly charge of 50, or a ‘Basic’ plan with a monthly charge of 20.
new_row = {'ID': 8, 'Plan': 'Premium', 'Monthly_Charge': 55} # Add the new row based on nested conditions if 'Premium' in df['Plan'].values: if 50 in df[df['Plan'] == 'Premium']['Monthly_Charge'].values: df.loc[len(df)] = new_row elif 'Basic' in df['Plan'].values: if 20 in df[df['Plan'] == 'Basic']['Monthly_Charge'].values: df.loc[len(df)] = new_row print(df)
Output:
ID Plan Monthly_Charge 0 1 Basic 20 1 2 Premium 50 2 3 Basic 20 3 4 Standard 30 4 5 Premium 50 5 6 Premium 50 6 7 Basic 25 7 8 Premium 55
Add Row Based on DateTime Condition
When you’re working with time-series data or timestamps, applying conditions based on DateTime objects becomes vital.
First, add a ‘StartDate’ column to the DataFrame with some sample DateTime data.
df['StartDate'] = pd.to_datetime(['2022-01-01', '2022-02-01', '2022-03-01', '2022-01-15', '2022-02-20', '2022-01-25', '2022-01-30', '2022-02-15' ]) print(df)
Output:
ID Plan Monthly_Charge StartDate 0 1 Basic 20 2022-01-01 1 2 Premium 50 2022-02-01 2 3 Basic 20 2022-03-01 3 4 Standard 30 2022-01-15 4 5 Premium 50 2022-02-20 5 6 Premium 50 2022-01-25 6 7 Basic 25 2022-01-30 7 8 Premium 55 2022-02-15
The ‘StartDate’ column is now part of your DataFrame.
Suppose you want to add a new row only if there are existing entries with a ‘StartDate’ in January 2022. You can do this as follows:
from datetime import datetime new_row = {'ID': 22, 'Plan': 'Ultra', 'Monthly_Charge': 90, 'StartDate': datetime.strptime('2022-04-15', '%Y-%m-%d')} if any(df['StartDate'].dt.month == 1) and any(df['StartDate'].dt.year == 2022): df.loc[len(df)] = new_row print(df)
Output:
ID Plan Monthly_Charge StartDate 0 1 Basic 20 2022-01-01 1 2 Premium 50 2022-02-01 2 3 Basic 20 2022-03-01 3 4 Standard 30 2022-01-15 4 5 Premium 50 2022-02-20 5 6 Premium 50 2022-01-25 6 7 Basic 25 2022-01-30 7 8 Premium 55 2022-02-15 8 22 Ultra 90 2022-04-15
Adding Rows Based on String Matching Conditions
Sometimes, you might want to add rows based on string matching conditions.
For instance, you may want to append a new row only if a particular substring appears in a text column of your DataFrame.
Let’s see how this works by adding a new column called ‘Description’ to our DataFrame.
First, we’ll add a ‘Description’ column containing some textual data to our existing DataFrame.
df['Description'] = ['Low tier', 'Top tier', 'Low tier', 'Mid tier', 'Top tier', 'Top tier', 'Low tier', 'Top tier', 'Superior tier'] print(df)
Output:
ID Plan Monthly_Charge StartDate Description 0 1 Basic 20 2022-01-01 Low tier 1 2 Premium 50 2022-02-01 Top tier 2 3 Basic 20 2022-03-01 Low tier 3 4 Standard 30 2022-01-15 Mid tier 4 5 Premium 50 2022-02-20 Top tier 5 6 Premium 50 2022-01-25 Top tier 6 7 Basic 25 2022-01-30 Low tier 7 8 Premium 55 2022-02-15 Top tier 8 22 Ultra 90 2022-04-15 Superior tier
Let’s say you want to add a new row only if there are any descriptions that contain the word “Superior.”
new_row = {'ID': 23, 'Plan': 'Elite', 'Monthly_Charge': 100, 'StartDate': datetime.strptime('2022-05-01', '%Y-%m-%d'), 'Description': 'Elite tier'} # Add row based on string matching if df['Description'].str.contains('Superior').any(): df.loc[len(df)] = new_row print(df)
Output:
ID Plan Monthly_Charge StartDate Description 0 1 Basic 20 2022-01-01 Low tier 1 2 Premium 50 2022-02-01 Top tier 2 3 Basic 20 2022-03-01 Low tier 3 4 Standard 30 2022-01-15 Mid tier 4 5 Premium 50 2022-02-20 Top tier 5 6 Premium 50 2022-01-25 Top tier 6 7 Basic 25 2022-01-30 Low tier 7 8 Premium 55 2022-02-15 Top tier 8 22 Ultra 90 2022-04-15 Superior tier 9 23 Elite 100 2022-05-01 Elite tier
Add Row Based on Presence of NaN Values
First, we will add some NaN values to our ‘Monthly_Charge’ column to simulate a typical data issue.
import numpy as np df.loc[[2, 5, 9], 'Monthly_Charge'] = np.nan print(df)
Output:
ID Plan Monthly_Charge StartDate Description 0 1 Basic 20.0 2022-01-01 Low tier 1 2 Premium 50.0 2022-02-01 Top tier 2 3 Basic NaN 2022-03-01 Low tier 3 4 Standard 30.0 2022-01-15 Mid tier 4 5 Premium 50.0 2022-02-20 Top tier 5 6 Premium NaN 2022-01-25 Top tier 6 7 Basic 25.0 2022-01-30 Low tier 7 8 Premium 55.0 2022-02-15 Top tier 8 22 Ultra 90.0 2022-04-15 Superior tier 9 23 Elite NaN 2022-05-01 Elite tier
Now, let’s say you want to add a row only if there are any NaN values in the ‘Monthly_Charge’ column. Here’s how you can do it:
new_row = {'ID': 24, 'Plan': 'Basic', 'Monthly_Charge': 15, 'StartDate': datetime.strptime('2022-05-10', '%Y-%m-%d'), 'Description': 'Lowest tier'} if df['Monthly_Charge'].isna().any(): df.loc[len(df)] = new_row print(df)
Output:
ID Plan Monthly_Charge StartDate Description 0 1 Basic 20.0 2022-01-01 Low tier 1 2 Premium 50.0 2022-02-01 Top tier 2 3 Basic NaN 2022-03-01 Low tier 3 4 Standard 30.0 2022-01-15 Mid tier 4 5 Premium 50.0 2022-02-20 Top tier 5 6 Premium NaN 2022-01-25 Top tier 6 7 Basic 25.0 2022-01-30 Low tier 7 8 Premium 55.0 2022-02-15 Top tier 8 22 Ultra 90.0 2022-04-15 Superior tier 9 23 Elite NaN 2022-05-01 Elite tier 10 24 Basic 15.0 2022-05-10 Lowest tier
Add Row Based on Previous Row Value
Consider a situation where you want to add a new row only if the ‘Monthly_Charge’ for the previous row is greater than a certain threshold. Here’s an example:
new_row = {'ID': 25, 'Plan': 'Standard', 'Monthly_Charge': 35} if df.iloc[-1]['Monthly_Charge'] >= 15: df.loc[len(df)] = new_row print(df)
Output:
ID Plan Monthly_Charge StartDate Description 0 1 Basic 20.0 2022-01-01 Low tier 1 2 Premium 50.0 2022-02-01 Top tier 2 3 Basic NaN 2022-03-01 Low tier 3 4 Standard 30.0 2022-01-15 Mid tier 4 5 Premium 50.0 2022-02-20 Top tier 5 6 Premium NaN 2022-01-25 Top tier 6 7 Basic 25.0 2022-01-30 Low tier 7 8 Premium 55.0 2022-02-15 Top tier 8 22 Ultra 90.0 2022-04-15 Superior tier 9 23 Elite NaN 2022-05-01 Elite tier 10 24 Basic 15.0 2022-05-10 Lowest tier 11 25 Standard 35.0 NaT NaN
Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.