Add Rows to Pandas DataFrame with Specific Index

In this tutorial, you will learn several ways to add a row to a Pandas DataFrame with a specific or custom index, ranging from the loc[] indexer to the concat() function.

 

 

Using loc[] for Index Assignment

One of the most straightforward ways to add a row to a DataFrame with a specific index is to use the loc[] indexer.

loc[] not only selects rows and columns by label but also creates a new row when you assign to a label that does not exist yet.

First, let’s create a sample DataFrame.

import pandas as pd
df = pd.DataFrame({
    'Plan': ['Basic', 'Standard', 'Premium'],
    'Monthly Cost': [5, 10, 20],
    'Data Limit (GB)': [10, 50, 100]
})
print(df)

Output:

       Plan  Monthly Cost  Data Limit (GB)
0     Basic             5               10
1  Standard            10               50
2   Premium            20              100

Now let’s add a new row with a specific index using loc[].

df.loc[3] = ['Ultra', 30, 200]
print(df)

Output:

       Plan  Monthly Cost  Data Limit (GB)
0     Basic             5               10
1  Standard            10               50
2   Premium            20              100
3     Ultra            30              200

The DataFrame now includes a new row with the index 3.
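
One caveat is worth illustrating: assigning to a label that already exists replaces that row instead of adding a new one. The sketch below uses a hypothetical helper, add_row_if_new (it is not part of pandas), that refuses to overwrite an existing label. The 'Standard Plus' values are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    'Plan': ['Basic', 'Standard', 'Premium'],
    'Monthly Cost': [5, 10, 20],
    'Data Limit (GB)': [10, 50, 100]
})

def add_row_if_new(frame, label, values):
    """Hypothetical helper: add a row only when `label` is not in the index yet."""
    if label in frame.index:
        raise KeyError(f"index label {label!r} already exists")
    frame.loc[label] = values  # enlarges the DataFrame in place
    return frame

# Label 3 is new, so a row is added
add_row_if_new(df, 3, ['Ultra', 30, 200])

# Plain loc[] on an existing label silently replaces the row
df.loc[1] = ['Standard Plus', 12, 60]

# Label 3 exists now, so the helper refuses to overwrite it
try:
    add_row_if_new(df, 3, ['Ultra', 35, 250])
    blocked = False
except KeyError:
    blocked = True

print(df)
print(blocked)
```

Checking `label in frame.index` first is one simple way to tell "add" apart from "update" when both go through the same loc[] assignment.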

 

Adding a Row with a Specific Index Value

Sometimes you want the new row to carry a specific index label that does not follow the existing sequence. loc[] still appends the row at the end of the DataFrame; only the label is different.

Let’s say you want to add a new plan but with an index value of 100. Here’s how you do it using loc[].

df.loc[100] = ['Exclusive', 50, 500]
print(df)

Output:

          Plan  Monthly Cost  Data Limit (GB)
0        Basic             5               10
1     Standard            10               50
2      Premium            20              100
3        Ultra            30              200
100  Exclusive            50              500

The DataFrame now contains a new row with a custom index value of 100.
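
Because loc[] places new rows at the end no matter what label you choose, the row order can drift away from the label order. The following minimal sketch shows one way to restore it with sort_index(). The 'Family' plan and its values are hypothetical and exist only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'Plan': ['Basic', 'Standard', 'Premium'],
    'Monthly Cost': [5, 10, 20],
    'Data Limit (GB)': [10, 50, 100]
})

# Add two rows with non-sequential labels, deliberately out of order
df.loc[100] = ['Exclusive', 50, 500]
df.loc[50] = ['Family', 40, 300]  # hypothetical plan

# Rows appear in the order they were added
order_as_added = df.index.tolist()

# sort_index() returns a new DataFrame ordered by index label
df_sorted = df.sort_index()
order_sorted = df_sorted.index.tolist()

print(order_as_added)
print(order_sorted)
```

sort_index() returns a new DataFrame, so assign the result back to df if you want to keep the sorted order.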

 

Adding Multiple Rows with Specific Indices

You can use the concat() function to add multiple rows with specific indices.

Let’s bring back our starting DataFrame:

df = pd.DataFrame({
    'Plan': ['Basic', 'Standard', 'Premium'],
    'Monthly Cost': [5, 10, 20],
    'Data Limit (GB)': [10, 50, 100]
})
print(df)

Output:

       Plan  Monthly Cost  Data Limit (GB)
0     Basic             5               10
1  Standard            10               50
2   Premium            20              100

To add multiple rows with specific indices, create a new DataFrame that holds those rows, set their labels with the index parameter, and then concatenate it with the existing DataFrame.

# Create a new DataFrame for the rows to be added
new_rows = pd.DataFrame({
    'Plan': ['Ultra', 'Exclusive'],
    'Monthly Cost': [30, 50],
    'Data Limit (GB)': [200, 500]
}, index=[3, 100])

# Concatenate the existing and new DataFrames
df = pd.concat([df, new_rows])
print(df)

Output:

          Plan  Monthly Cost  Data Limit (GB)
0        Basic             5               10
1     Standard            10               50
2      Premium            20              100
3        Ultra            30              200
100  Exclusive            50              500

The DataFrame now contains the rows ‘Ultra’ and ‘Exclusive’ with the index labels 3 and 100, respectively.
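
Unlike loc[], concat() does not replace a row whose label already exists; it keeps both rows, which leaves you with a duplicate index label. As a sketch of how you might guard against that, the example below passes verify_integrity=True so that concat() raises a ValueError on overlapping labels. The 'Premium Plus' row is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'Plan': ['Basic', 'Standard', 'Premium'],
    'Monthly Cost': [5, 10, 20],
    'Data Limit (GB)': [10, 50, 100]
})

# A new row whose label (2) clashes with an existing one; values are made up
clashing_rows = pd.DataFrame({
    'Plan': ['Premium Plus'],
    'Monthly Cost': [25],
    'Data Limit (GB)': [150]
}, index=[2])

# Default behaviour: both rows labelled 2 are kept
combined = pd.concat([df, clashing_rows])
has_duplicates = not combined.index.is_unique

# With verify_integrity=True the overlap is reported as an error instead
try:
    pd.concat([df, clashing_rows], verify_integrity=True)
    rejected = False
except ValueError:
    rejected = True

print(has_duplicates)
print(rejected)
```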

 

Adding Rows with Timestamp Indices

You can manually specify timestamp indices while adding rows to a DataFrame.

Let’s construct a DataFrame that contains timestamp indices.

import pandas as pd
import datetime as dt
df = pd.DataFrame({
    'Plan': ['Basic', 'Standard', 'Premium'],
    'Monthly Cost': [5, 10, 20],
    'Data Limit (GB)': [10, 50, 100]
}, index=[dt.datetime(2023, 1, 1), dt.datetime(2023, 1, 2), dt.datetime(2023, 1, 3)])
print(df)

Output:

                Plan  Monthly Cost  Data Limit (GB)
2023-01-01     Basic             5               10
2023-01-02  Standard            10               50
2023-01-03   Premium            20              100

Here, the index holds three dates in January 2023. Pandas stores the datetime objects in a DatetimeIndex.

To insert a row with a timestamp index, use loc[] as follows:

df.loc[dt.datetime(2023, 1, 4)] = ['Ultra', 30, 200]
print(df)

Output:

                Plan  Monthly Cost  Data Limit (GB)
2023-01-01     Basic             5               10
2023-01-02  Standard            10               50
2023-01-03   Premium            20              100
2023-01-04     Ultra            30              200
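
You can also use pd.Timestamp objects as keys instead of datetime.datetime. The sketch below adds two dated rows out of chronological order, checks whether the index is still in order, and then sorts it. It builds the starting index with pd.to_datetime(), which is an assumption made to keep the example self-contained.

```python
import pandas as pd

df = pd.DataFrame({
    'Plan': ['Basic', 'Standard', 'Premium'],
    'Monthly Cost': [5, 10, 20],
    'Data Limit (GB)': [10, 50, 100]
}, index=pd.to_datetime(['2023-01-01', '2023-01-02', '2023-01-03']))

# Add January 5 first, then January 4, so the rows end up out of order
df.loc[pd.Timestamp('2023-01-05')] = ['Exclusive', 50, 500]
df.loc[pd.Timestamp('2023-01-04')] = ['Ultra', 30, 200]

# False here, because January 4 was added after January 5
in_order = df.index.is_monotonic_increasing

# Restore chronological order
df = df.sort_index()
print(df)
```

Keeping a timestamp index sorted is usually worthwhile, since date-range slicing behaves most predictably on an ordered index.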

Handling Timezone-Aware Timestamps

Pandas also allows for timezone-aware datetime indices. Here’s how you can insert a row with a timezone-aware timestamp index:

# Convert DataFrame index to timezone-aware
df.index = df.index.tz_localize('UTC')

# Add a new row with a timezone-aware timestamp
df.loc[dt.datetime(2023, 1, 5, tzinfo=dt.timezone.utc)] = ['Exclusive', 50, 500]
print(df)

Output:

                                Plan  Monthly Cost  Data Limit (GB)
2023-01-01 00:00:00+00:00      Basic             5               10
2023-01-02 00:00:00+00:00   Standard            10               50
2023-01-03 00:00:00+00:00    Premium            20              100
2023-01-04 00:00:00+00:00      Ultra            30              200
2023-01-05 00:00:00+00:00  Exclusive            50              500

The DataFrame now has a timezone-aware timestamp index for January 5, 2023, representing the ‘Exclusive’ plan.
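
When a timestamp arrives in a different timezone than the index, one reasonable approach is to convert it to the index's timezone before using it as a key. The following is a minimal sketch under that assumption. The fixed UTC-5 offset and the starting index built with pd.date_range() are chosen only for illustration.

```python
import datetime as dt
import pandas as pd

df = pd.DataFrame({
    'Plan': ['Basic', 'Standard', 'Premium'],
    'Monthly Cost': [5, 10, 20],
    'Data Limit (GB)': [10, 50, 100]
}, index=pd.date_range('2023-01-01', periods=3, freq='D', tz='UTC'))

# A timestamp recorded at a fixed UTC-5 offset (illustrative assumption)
utc_minus_5 = dt.timezone(dt.timedelta(hours=-5))
local_key = pd.Timestamp(dt.datetime(2023, 1, 3, 19, 0, tzinfo=utc_minus_5))

# Convert the key to UTC so it matches the timezone of the index
utc_key = local_key.tz_convert('UTC')

df.loc[utc_key] = ['Ultra', 30, 200]
print(df)
```

19:00 on January 3 at UTC-5 is the same instant as midnight on January 4 in UTC, so the new row lands on the January 4 label.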

 

Adding Rows with Specific Hierarchical Indices

Hierarchical indexing, also known as multi-indexing, allows you to have multiple levels of indices in a DataFrame.

You can add rows with hierarchical indices in a manner similar to single-level indexing but with a few additional considerations.

First, let’s initialize a DataFrame with hierarchical indices:

arrays = [
    ['Basic', 'Basic', 'Premium', 'Premium'],
    ['Monthly', 'Annual', 'Monthly', 'Annual'],
]
index = pd.MultiIndex.from_arrays(arrays, names=('Plan', 'Type'))
df = pd.DataFrame({
    'Cost': [5, 55, 20, 220],
    'Data Limit (GB)': [10, 10, 100, 100]
}, index=index)
print(df)

Output:

                 Cost  Data Limit (GB)
Plan    Type
Basic   Monthly     5               10
        Annual     55               10
Premium Monthly    20              100
        Annual    220              100

To insert a row with a hierarchical index, pass loc[] a tuple with one label per index level, followed by : to select all columns. For example, to add an “Ultra” plan that has both Monthly and Annual types:

df.loc[('Ultra', 'Monthly'), :] = [30, 200]
df.loc[('Ultra', 'Annual'), :] = [330, 200]
print(df)

Output:

                 Cost  Data Limit (GB)
Plan    Type
Basic   Monthly     5               10
        Annual     55               10
Premium Monthly    20              100
        Annual    220              100
Ultra   Monthly    30              200
        Annual    330              200

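The DataFrame now has an ‘Ultra’ plan, with both ‘Monthly’ and ‘Annual’ types, each having its own cost and data limit. Depending on your pandas version, enlarging a DataFrame through loc[key, :] may upcast the integer columns to float, so the values can print as 30.0 and 200.0. If that happens, you can convert them back with astype(int).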
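
An alternative to loc[] for hierarchical indices is to build the new rows as their own DataFrame with pd.MultiIndex.from_tuples() and combine them with pd.concat(), the same pattern used earlier for adding multiple rows. Because both DataFrames already hold integer columns, the result keeps integer dtypes. This is a sketch of that alternative rather than the approach used above, and the final sort_index() call is optional.

```python
import pandas as pd

index = pd.MultiIndex.from_arrays(
    [
        ['Basic', 'Basic', 'Premium', 'Premium'],
        ['Monthly', 'Annual', 'Monthly', 'Annual'],
    ],
    names=('Plan', 'Type'),
)
df = pd.DataFrame({
    'Cost': [5, 55, 20, 220],
    'Data Limit (GB)': [10, 10, 100, 100]
}, index=index)

# Build the new rows with a matching two-level index
new_index = pd.MultiIndex.from_tuples(
    [('Ultra', 'Monthly'), ('Ultra', 'Annual')],
    names=('Plan', 'Type'),
)
new_rows = pd.DataFrame({
    'Cost': [30, 330],
    'Data Limit (GB)': [200, 200]
}, index=new_index)

# Concatenate, then sort so the index levels are in lexical order
combined = pd.concat([df, new_rows]).sort_index()
print(combined)
print(combined.dtypes)
```

Note that sort_index() orders each level alphabetically, so 'Annual' is listed before 'Monthly' within every plan.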
