Add Empty Rows to Pandas DataFrame in Python

You may find yourself in a case where you need to add an empty row to your Pandas DataFrame.

Whether you’re preparing your data for analysis or need to insert placeholders for future data, this tutorial shows several ways to do it.

 

 

Using loc[]

You can use the loc[] indexer to add an empty row to your DataFrame.

First, let’s create a sample DataFrame.

import pandas as pd
data = {'ID': [1, 2, 3],
        'Name': ['Emma', 'Sophia', 'Liam'],
        'Age': [28, 22, 36]}
df = pd.DataFrame(data)
print(df)

Output:

   ID    Name  Age
0   1    Emma   28
1   2  Sophia   22
2   3    Liam   36

Now, let’s add an empty row using loc[].

new_index = max(df.index) + 1
df.loc[new_index] = ''
print(df)

Output:

  ID    Name Age
0  1    Emma  28
1  2  Sophia  22
2  3    Liam  36
3               

We calculate new_index by taking the largest existing index label and adding one to it, which assumes an integer index such as the default one.

Then, we use loc[] to create a row at that new label, setting every cell in it to an empty string.
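One side effect is worth checking: '' is a string, so the numeric columns can no longer stay numeric. The minimal sketch below rebuilds the sample DataFrame and inspects the column types after the assignment. It uses df.index.max() in place of max(df.index), which gives the same label for an integer index.

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3],
                   'Name': ['Emma', 'Sophia', 'Liam'],
                   'Age': [28, 22, 36]})

# Assumes a default integer index, so "max + 1" is the next free label
new_index = df.index.max() + 1
df.loc[new_index] = ''

# ID and Age now mix integers with a string, so they are no longer numeric
print(df.dtypes)
```

If you need to keep doing arithmetic on ID or Age, a missing value is usually a better filler than an empty string.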

 

Using concat

The Pandas concat function offers another way to add an empty row to your DataFrame.

Let’s get started by creating the original DataFrame:

data = {'ID': [1, 2, 3],
        'Name': ['Emma', 'Sophia', 'Liam'],
        'Age': [28, 22, 36]}
df = pd.DataFrame(data)
print(df)

Output:

   ID    Name  Age
0   1    Emma   28
1   2  Sophia   22
2   3    Liam   36

To add empty rows (two in this example), let’s use the concat function:

empty_rows = 2

# Build a DataFrame of empty-string rows with the same columns as df
empty_data = {col: ['' for _ in range(empty_rows)] for col in df.columns}
empty_df = pd.DataFrame(empty_data)

# Concatenate the original DataFrame with the empty DataFrame
df = pd.concat([df, empty_df], ignore_index=True)
print(df)

Output:

  ID    Name Age
0  1    Emma  28
1  2  Sophia  22
2  3    Liam  36
3               
4               

The empty_data dictionary is created with the same columns as the original DataFrame.

Each column is initialized with a list of empty strings ('') for the desired number of empty rows.

Then, a DataFrame (empty_df) is created using the empty_data dictionary.

Finally, the pd.concat() function is used to concatenate the original DataFrame (df) with the empty DataFrame (empty_df).
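If you would rather have the new rows hold missing values than empty strings, reindex is a possible alternative. Note that this is a different technique from the concat approach above, shown only as a sketch: it assumes the DataFrame has the default 0..n-1 index, and it turns the integer columns into float64 because the new cells are NaN.

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3],
                   'Name': ['Emma', 'Sophia', 'Liam'],
                   'Age': [28, 22, 36]})
empty_rows = 2

# Extend the default index by two labels; labels that did not exist
# before (3 and 4) get rows filled with NaN
extended = df.reindex(range(len(df) + empty_rows))
print(extended)
```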

 

Adding an Empty Row at the Top of a DataFrame

You can do this by concatenating a DataFrame of empty rows in front of the original one, which pushes the existing rows down.

First, let’s revisit our example DataFrame:

data = {'ID': [1, 2, 3],
        'Name': ['Emma', 'Sophia', 'Liam'],
        'Age': [28, 22, 36]}
df = pd.DataFrame(data)
print(df)

Output:

   ID    Name  Age
0   1    Emma   28
1   2  Sophia   22
2   3    Liam   36

Now, to add empty rows (two in this example) at the beginning, follow these steps:

empty_rows = 2
empty_data = {col: ['' for _ in range(empty_rows)] for col in df.columns}
empty_df = pd.DataFrame(empty_data)

# Concatenate the empty DataFrame with the original DataFrame
df = pd.concat([empty_df, df])
df.reset_index(drop=True, inplace=True)
print(df)

Output:

  ID    Name Age
0               
1               
2  1    Emma  28
3  2  Sophia  22
4  3    Liam  36

In this method, you create a DataFrame of empty rows and concatenate it with the original DataFrame.

Because the empty DataFrame comes first in the concatenation, its rows end up at the top of the result. Resetting the index then renumbers the rows from 0.
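The concatenation and the index reset can also be done in one call, using the same ignore_index=True argument as in the concat section. A minimal sketch with the sample data (the name result is only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3],
                   'Name': ['Emma', 'Sophia', 'Liam'],
                   'Age': [28, 22, 36]})
empty_df = pd.DataFrame({col: [''] * 2 for col in df.columns})

# ignore_index=True renumbers the rows 0..4 during the concatenation,
# so no separate reset_index call is needed
result = pd.concat([empty_df, df], ignore_index=True)
print(result)
```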

 

Adding Empty Rows at a Specific Position

You can achieve this by slicing the DataFrame into two parts at the position where you want to insert the empty row and then rejoining them using pd.concat.

Here is the initial DataFrame for context:

data = {'ID': [1, 2, 3],
        'Name': ['Emma', 'Sophia', 'Liam'],
        'Age': [28, 22, 36]}
df = pd.DataFrame(data)
print(df)

Output:

   ID    Name  Age
0   1    Emma   28
1   2  Sophia   22
2   3    Liam   36

Now, let’s say you want to insert two empty rows starting at the second position (index 1).

empty_rows = 2
empty_data = {col: ['' for _ in range(empty_rows)] for col in df.columns}
empty_df = pd.DataFrame(empty_data)

# Slice the original DataFrame into two parts
df1 = df.iloc[:1]
df2 = df.iloc[1:]

# Concatenate the three DataFrames: df1, empty_df, and df2
df = pd.concat([df1, empty_df, df2])
df.reset_index(drop=True, inplace=True)
print(df)

Output:

  ID    Name Age
0  1    Emma  28
1               
2               
3  2  Sophia  22
4  3    Liam  36

To insert the empty rows, the DataFrame is sliced at position 1 into two parts: df1 (the rows before that position) and df2 (the rows from that position on).

The empty DataFrame (empty_df) is then concatenated between these two slices. Finally, the index is reset so the rows are numbered consecutively from 0.
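The slice-and-concatenate steps can be wrapped in a small function so that the top, the bottom, and any position in between are handled the same way. This is only a sketch: insert_empty_rows and its parameters are hypothetical names, not part of pandas.

```python
import pandas as pd

def insert_empty_rows(df, position, n=1, fill=''):
    """Hypothetical helper: return a copy of df with n filler rows
    inserted before the row at integer position `position`."""
    if n == 0:
        return df.copy()
    filler = pd.DataFrame({col: [fill] * n for col in df.columns})
    # Skip zero-length slices (position 0 or len(df)) so that only
    # non-empty pieces are passed to concat
    parts = [part for part in (df.iloc[:position], filler, df.iloc[position:])
             if len(part)]
    return pd.concat(parts, ignore_index=True)

df = pd.DataFrame({'ID': [1, 2, 3],
                   'Name': ['Emma', 'Sophia', 'Liam'],
                   'Age': [28, 22, 36]})

middle = insert_empty_rows(df, 1, n=2)   # two rows before 'Sophia'
top = insert_empty_rows(df, 0)           # one row at the top
bottom = insert_empty_rows(df, len(df))  # one row at the bottom
print(middle)
```

Returning a new DataFrame instead of modifying df in place keeps the original available for further inserts.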

 

Preserving the Data Types when Adding Empty Rows

Adding an empty row with None or NaN fills the new cells with NaN, which pandas stores as a float.

An int64 column cannot hold NaN, so the whole column is converted to float64.

As usual, let’s start with the original DataFrame for reference:

data = {'ID': [1, 2, 3],
        'Name': ['Emma', 'Sophia', 'Liam'],
        'Age': [28, 22, 36]}
df = pd.DataFrame(data)
print(df.dtypes)

Output:

ID       int64
Name    object
Age      int64
dtype: object

You have columns of types int64 and object.

Now, if you were to add an empty row, you’d notice a type change:

# Adding an empty row using loc
df.loc[len(df.index)] = None
print(df.dtypes)

Output:

ID      float64
Name     object
Age     float64
dtype: object

As you can see, the data types for the ‘ID’ and ‘Age’ columns have changed to float64.

Using astype to Preserve Data Types

You can use the astype method to turn the affected columns back into integer columns after adding empty rows:

df = df.astype({'ID': 'Int64', 'Name': 'object', 'Age': 'Int64'})
print(df.dtypes)

Output:

ID      Int64
Name    object
Age     Int64
dtype: object

The capital ‘I’ matters: ‘Int64’ is pandas’ nullable integer type, which can hold missing values (displayed as <NA>) alongside integers, whereas the lowercase ‘int64’ cannot.
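With many columns, writing the astype mapping by hand gets tedious. One possible approach, sketched below, is to record the dtypes before the row is added and build the mapping from them. The sketch adds the row with reindex rather than loc (an assumption made here so that the new cells are plain NaN), and the names original_dtypes and nullable are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3],
                   'Name': ['Emma', 'Sophia', 'Liam'],
                   'Age': [28, 22, 36]})
original_dtypes = df.dtypes  # remember the types before the row is added

# Add one all-NaN row; the integer columns become float64 at this point
df = df.reindex(range(len(df) + 1))

# Map every column that used to be an integer type to nullable 'Int64'
# (for brevity, smaller types such as int32 are also widened to Int64)
nullable = {col: 'Int64' for col, dtype in original_dtypes.items()
            if pd.api.types.is_integer_dtype(dtype)}
df = df.astype(nullable)
print(df.dtypes)
```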

 

Using Placeholders in Empty Rows

Sometimes, instead of leaving the new rows completely empty, you might want to insert placeholder values or default values.

Adding a Row with Default Values

First, let’s create the original DataFrame:

data = {'ID': [1, 2, 3],
        'Name': ['Emma', 'Sophia', 'Liam'],
        'Age': [28, 22, 36]}
df = pd.DataFrame(data)
print(df)

Output:

   ID    Name  Age
0   1    Emma   28
1   2  Sophia   22
2   3    Liam   36

To add a row with default values, you can do the following:

df.loc[len(df.index)] = [0, 'Unknown', 0]
print(df)

Output:

   ID     Name  Age
0   1     Emma   28
1   2   Sophia   22
2   3     Liam   36
3   0  Unknown    0

In this example, you’re adding a row with default values like 0 for the ‘ID’ and ‘Age’ columns and ‘Unknown’ for the ‘Name’ column.
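When the defaults are easier to maintain by column name than by position, they can be kept in a dictionary and put in column order just before the assignment. A minimal sketch, where the defaults dictionary is a hypothetical example:

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3],
                   'Name': ['Emma', 'Sophia', 'Liam'],
                   'Age': [28, 22, 36]})

# Hypothetical per-column defaults, keyed by column name
defaults = {'Name': 'Unknown', 'ID': 0, 'Age': 0}

# Order the values by df.columns so the key order in the dict does not matter;
# len(df.index) is the next free label only for a default 0..n-1 index
df.loc[len(df.index)] = [defaults[col] for col in df.columns]
print(df)
```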
