Using Pandas DataFrame iloc Property for Index Based Access
The iloc property in the Pandas library stands for “integer-location” and provides integer-based indexing for selection by position.
This means you can select rows and columns in a DataFrame by their integer position.
In this tutorial, we’ll cover various aspects of using iloc, including selecting single rows, multiple rows, specific columns, and even individual cells. We’ll also delve into advanced techniques like boolean indexing.
- 1 Select a Single Row by Integer Index
- 2 Select Multiple Rows Using a List of Integer Indices
- 3 Slice Rows Using a Range of Integer Indices
- 4 Select a Single Column by Integer Index
- 5 Select Multiple Columns Using a List of Integer Indices
- 6 Slice Columns Using a Range of Integer Indices
- 7 Select a Single Cell by Specifying Row and Column Indices
- 8 Select Rows for Specific Columns Using Lists of Indices
- 9 Slice Rows and Columns Using Ranges of Integer Indices
- 10 Set the Value of a Specific Cell
- 11 Set Values for a Row or a Set of Rows
- 12 Set Values for a Column or a Set of Columns
- 13 Set Values for a Range of Cells (Both Rows and Columns)
- 14 Boolean Indexing (Use Boolean Arrays/Masks)
- 15 Error Handling and Common Pitfalls
- 16 Resource
Select a Single Row by Integer Index
You can get an entire row of data by providing the integer index of the row you want to extract.
    import pandas as pd

    data = {
        'Name': ['John', 'Doe', 'Jane', 'Smith'],
        'Age': [28, 34, 22, 45],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
    }
    df = pd.DataFrame(data)

    # Select the second row (positions are zero-based)
    selected_row = df.iloc[1]
    print(selected_row)
Output:
    Name            Doe
    Age              34
    City    Los Angeles
    Name: 1, dtype: object
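Because iloc works on positions rather than labels, the result does not depend on what the index labels are. The following is a minimal sketch of that idea: it rebuilds the sample data with a made-up string index (the labels 'a' to 'd' and the name df_labeled are assumptions for illustration only) and compares iloc with the label-based loc.

```python
import pandas as pd

# Same sample data, but with hypothetical string labels as the index
df_labeled = pd.DataFrame(
    {
        'Name': ['John', 'Doe', 'Jane', 'Smith'],
        'Age': [28, 34, 22, 45],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
    },
    index=['a', 'b', 'c', 'd'],
)

# Position 1 is the second row, whatever its label happens to be
by_position = df_labeled.iloc[1]

# The same row fetched by its label
by_label = df_labeled.loc['b']
print(by_position.equals(by_label))  # True

# Passing a one-element list keeps the result as a DataFrame instead of a Series
one_row_frame = df_labeled.iloc[[1]]
print(type(by_position).__name__, type(one_row_frame).__name__)
```

Wrapping the position in a list is a handy way to keep a tabular result when later code expects a DataFrame.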
Select Multiple Rows Using a List of Integer Indices
Sometimes, you might want to retrieve multiple rows at once based on their positions. With iloc, you can do this by providing a list of integer indices.
This method returns a new DataFrame containing only the rows at the specified positions.
    # Select the first and third rows
    selected_rows = df.iloc[[0, 2]]
    print(selected_rows)
Output:
       Name  Age      City
    0  John   28  New York
    2  Jane   22   Chicago
By passing a list containing 0 and 2 to iloc, we’ve fetched the first and third rows of our DataFrame.
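The list of positions does not have to be sorted or unique. As a small sketch (df_demo is just an illustrative copy of the sample data, not a name from the tutorial), the rows come back in the order the positions are listed, and a position can be repeated:

```python
import pandas as pd

# Illustrative copy of the tutorial's sample data
df_demo = pd.DataFrame({
    'Name': ['John', 'Doe', 'Jane', 'Smith'],
    'Age': [28, 34, 22, 45],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
})

# Rows are returned in the order the positions are given
reordered = df_demo.iloc[[2, 0]]
print(reordered['Name'].tolist())  # ['Jane', 'John']

# The same position may appear more than once
repeated = df_demo.iloc[[1, 1]]
print(len(repeated))  # 2
```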
Slice Rows Using a Range of Integer Indices
iloc also allows you to use slices to select rows.
You specify a start index and an end index, and optionally a step. Without a step, this returns a run of consecutive rows from the DataFrame.
    # Select the first three rows
    sliced_rows = df.iloc[0:3]
    print(sliced_rows)
Output:
       Name  Age         City
    0  John   28     New York
    1   Doe   34  Los Angeles
    2  Jane   22      Chicago
In the provided example, we start at position 0 (inclusive) and go up to, but not including, position 3.
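The optional step and negative positions follow the usual Python slicing rules. Here is a brief sketch, again using an illustrative df_demo copy of the sample data:

```python
import pandas as pd

# Illustrative copy of the tutorial's sample data
df_demo = pd.DataFrame({
    'Name': ['John', 'Doe', 'Jane', 'Smith'],
    'Age': [28, 34, 22, 45],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
})

# A step of 2 takes every other row (positions 0 and 2)
every_other = df_demo.iloc[::2]
print(every_other['Name'].tolist())  # ['John', 'Jane']

# Negative positions count from the end (the last two rows)
last_two = df_demo.iloc[-2:]
print(last_two['Name'].tolist())  # ['Jane', 'Smith']

# A negative step reverses the row order
reversed_rows = df_demo.iloc[::-1]
print(reversed_rows['Name'].tolist())  # ['Smith', 'Jane', 'Doe', 'John']
```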
Select a Single Column by Integer Index
Columns are the second axis (axis=1). Using iloc, you can select individual columns by their integer index.
Note that when you extract a single column using iloc, the result is a Series object, not a DataFrame.
    # Select the first column
    selected_column = df.iloc[:, 0]
    print(selected_column)
Output:
    0     John
    1      Doe
    2     Jane
    3    Smith
    Name: Name, dtype: object
The colon : in the row position means “all rows”, and the 0 following the comma specifies the first column.
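If you know a column's name but need its position, one possible approach is to look it up with df.columns.get_loc and pass the result to iloc. The sketch below assumes unique column names (with duplicates, get_loc does not return a single integer); df_demo and age_pos are illustrative names.

```python
import pandas as pd

# Illustrative copy of the tutorial's sample data
df_demo = pd.DataFrame({
    'Name': ['John', 'Doe', 'Jane', 'Smith'],
    'Age': [28, 34, 22, 45],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
})

# Translate a column name into its integer position (assumes unique names)
age_pos = df_demo.columns.get_loc('Age')
print(age_pos)  # 1

# A plain integer gives a Series
age_column = df_demo.iloc[:, age_pos]

# A one-element list gives a single-column DataFrame
age_frame = df_demo.iloc[:, [age_pos]]
print(type(age_column).__name__, type(age_frame).__name__)
```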
Select Multiple Columns Using a List of Integer Indices
Just as you can select multiple rows with a list of indices, iloc supports the selection of multiple columns by providing a list of integer indices for the columns.
    # Select the first and third columns
    selected_columns = df.iloc[:, [0, 2]]
    print(selected_columns)
Output:
        Name         City
    0   John     New York
    1    Doe  Los Angeles
    2   Jane      Chicago
    3  Smith      Houston
In this example, we target the first and third columns using the list [0, 2] in the column position.
Slice Columns Using a Range of Integer Indices
You can use iloc combined with range-based indexing to select a continuous set of columns based on their position:
    # Select the first two columns
    sliced_columns = df.iloc[:, 0:2]
    print(sliced_columns)
Output:
        Name  Age
    0   John   28
    1    Doe   34
    2   Jane   22
    3  Smith   45
Here, we’ve used the range 0:2 in the column position of iloc.
This selects columns starting from position 0 (inclusive) up to, but not including, position 2.
Select a Single Cell by Specifying Row and Column Indices
Using iloc, you can pinpoint and extract the value of a single cell by specifying both its row and column integer indices.
    # Select the cell from the second row and first column
    cell_value = df.iloc[1, 0]
    print(cell_value)
Output:
    Doe
In this code snippet, we’ve targeted the cell in the second row and first column using iloc[1, 0]. The result is the name “Doe”.
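For single-cell access, pandas also offers the iat accessor, which takes the same integer positions but only ever returns or sets one scalar. The comparison below is a small sketch on an illustrative df_demo copy of the sample data:

```python
import pandas as pd

# Illustrative copy of the tutorial's sample data
df_demo = pd.DataFrame({
    'Name': ['John', 'Doe', 'Jane', 'Smith'],
    'Age': [28, 34, 22, 45],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
})

# Both accessors address the cell at row position 1, column position 0
value_iloc = df_demo.iloc[1, 0]
value_iat = df_demo.iat[1, 0]
print(value_iloc, value_iat)  # Doe Doe
```

iloc remains the more general tool, since it also accepts lists, slices, and boolean arrays.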
Select Rows for Specific Columns Using Lists of Indices
You can select multiple rows and specific columns simultaneously by providing lists of integer indices for both dimensions.
    # Select the first and third rows for the first and third columns
    subset = df.iloc[[0, 2], [0, 2]]
    print(subset)
Output:
       Name      City
    0  John  New York
    2  Jane   Chicago
In the example provided, we’ve specified the same list, [0, 2], for both the row and column indices.
This fetches the first and third rows, and within those rows, only the first and third columns.
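Note that two lists in iloc select every combination of the listed rows and columns, which differs from how NumPy treats two index lists (it pairs them element by element). The sketch below illustrates the difference on a small numeric frame; df_num and its values are made up for the example.

```python
import pandas as pd

# Hypothetical 3x3 numeric frame, used only to make the difference visible
df_num = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    columns=['A', 'B', 'C'],
)

# iloc: rows 0 and 2 crossed with columns 0 and 2 gives a 2x2 block
block = df_num.iloc[[0, 2], [0, 2]]
print(block.to_numpy().tolist())  # [[1, 3], [7, 9]]

# NumPy: the lists are paired, giving the cells (0, 0) and (2, 2)
paired = df_num.to_numpy()[[0, 2], [0, 2]]
print(paired.tolist())  # [1, 9]
```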
Slice Rows and Columns Using Ranges of Integer Indices
With iloc, you can slice rows and columns using ranges, providing a sub-DataFrame as the output.
    # Select the first three rows and first two columns
    subset = df.iloc[0:3, 0:2]
    print(subset)
Output:
       Name  Age
    0  John   28
    1   Doe   34
    2  Jane   22
In the demonstrated code, we’ve combined two ranges: 0:3 for rows and 0:2 for columns.
This selects the first three rows and the first two columns.
Set the Value of a Specific Cell
Using iloc, you can set the value for any specific cell by specifying its row and column indices.
    # Set the value of the cell in the second row and first column to 'Alex'
    df.iloc[1, 0] = 'Alex'
    print(df)
Output:
        Name  Age         City
    0   John   28     New York
    1   Alex   34  Los Angeles
    2   Jane   22      Chicago
    3  Smith   45      Houston
Set Values for a Row or a Set of Rows
With the iloc property, you can update the values for an entire row or a set of rows:
    # Set values for the third row
    df.iloc[2] = ['Ella', 30, 'Seattle']
    # Set values for the first and fourth rows
    df.iloc[[0, 3]] = [['Bob', 29, 'Boston'], ['Lucas', 47, 'Miami']]
    print(df)
Output:
        Name  Age         City
    0    Bob   29       Boston
    1   Alex   34  Los Angeles
    2   Ella   30      Seattle
    3  Lucas   47        Miami
In the example, we first set new values for the third row using df.iloc[2] = ['Ella', 30, 'Seattle'], updating the data for “Jane”.
Then, we target the first and fourth rows, assigning new values simultaneously.
Set Values for a Column or a Set of Columns
The iloc property allows you to update an entire column or several columns at once:
    # Set values for the 'Age' column
    df.iloc[:, 1] = [35, 36, 31, 48]
    # Set values for the 'Name' and 'City' columns
    df.iloc[:, [0, 2]] = [['Mia', 'Atlanta'], ['Liam', 'Dallas'], ['Sophia', 'Denver'], ['Ethan', 'Phoenix']]
    print(df)
Output:
         Name  Age     City
    0     Mia   35  Atlanta
    1    Liam   36   Dallas
    2  Sophia   31   Denver
    3   Ethan   48  Phoenix
Here, we first target the ‘Age’ column and assign a new list of age values using df.iloc[:, 1].
Next, we proceed to set values for both the ‘Name’ and ‘City’ columns simultaneously.
Set Values for a Range of Cells (Both Rows and Columns)
The iloc property allows you to update a range of cells across both rows and columns by assigning to a slice.
    # Set values for the cells in the first two rows and last two columns
    df.iloc[0:2, 1:3] = [[40, 'Orlando'], [37, 'Sacramento']]
    print(df)
Output:
         Name  Age        City
    0     Mia   40     Orlando
    1    Liam   37  Sacramento
    2  Sophia   31      Denver
    3   Ethan   48     Phoenix
In the demonstration above, we’ve chosen a block of cells spanning the first two rows and the last two columns of the DataFrame.
By using df.iloc[0:2, 1:3], we specify this range and set new values for the ‘Age’ and ‘City’ columns for the respective rows. The ‘Name’ column and the last two rows keep the values from the previous step.
Remember, when updating a range of cells, the shape of the value you’re assigning should match the shape of the cell range you’re targeting; a mismatched shape typically raises a ValueError.
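To make the shape rule concrete, here is a rough sketch on an illustrative df_demo copy of the original sample data. It assumes a mixed-dtype frame like the one in this tutorial, and the exact error message may vary between pandas versions, so only the exception type is checked.

```python
import pandas as pd

# Illustrative copy of the tutorial's sample data
df_demo = pd.DataFrame({
    'Name': ['John', 'Doe', 'Jane', 'Smith'],
    'Age': [28, 34, 22, 45],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
})

# Three values per row do not fit a target block that is two columns wide
try:
    df_demo.iloc[0:2, 1:3] = [[40, 'Orlando', 'extra'], [37, 'Sacramento', 'extra']]
    shape_error = None
except ValueError as e:
    shape_error = e
print(f"Error: {shape_error}")

# A single scalar, by contrast, is broadcast to every targeted cell
df_demo.iloc[0:2, 1] = 0
print(df_demo['Age'].tolist())  # [0, 0, 22, 45]
```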
Boolean Indexing (Use Boolean Arrays/Masks)
Instead of selecting rows or columns by their integer indices, you can use arrays of boolean values (True or False) to filter rows based on certain criteria.
Let’s delve into how you can combine boolean arrays/masks with iloc to refine your DataFrame selections.
Basic Boolean Indexing
Start by creating a boolean mask based on a condition. The examples in this section use the original data, so we rebuild the DataFrame first:
    # Rebuild the DataFrame from the original data
    df = pd.DataFrame(data)
    # Create a boolean mask for rows where Age is greater than 35
    age_mask = df['Age'] > 35
Now, apply this mask using iloc:
    filtered_data = df.iloc[age_mask.values]
    print(filtered_data)
Output:
        Name  Age     City
    3  Smith   45  Houston
In the example, we first generate a boolean mask age_mask that identifies rows where the ‘Age’ exceeds 35. iloc does not accept a boolean Series directly, so we pass the underlying array with .values. When applied with iloc, only the rows with True values in the mask are retained.
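The sketch below shows why the mask is converted to an array first. It uses to_numpy(), which returns the same underlying array as .values, and compares the result with the label-based loc, which accepts the boolean Series as is. df_demo is an illustrative copy of the original sample data, and the exception type raised for a Series mask depends on the index type, so both likely types are caught.

```python
import pandas as pd

# Illustrative copy of the tutorial's sample data
df_demo = pd.DataFrame({
    'Name': ['John', 'Doe', 'Jane', 'Smith'],
    'Age': [28, 34, 22, 45],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
})
age_mask = df_demo['Age'] > 35

# Passing the boolean Series itself to iloc is rejected
try:
    df_demo.iloc[age_mask]
    direct_error = None
except (NotImplementedError, ValueError) as e:
    direct_error = e
print(f"Error: {direct_error}")

# Converting the mask to a plain array works
via_array = df_demo.iloc[age_mask.to_numpy()]

# loc accepts the boolean Series directly and gives the same rows
via_loc = df_demo.loc[age_mask]
print(via_array.equals(via_loc))  # True
```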
Combining Multiple Conditions
You can combine multiple conditions using bitwise operators like & (and), | (or), and ~ (not). Wrap each condition in parentheses, because these operators bind more tightly than comparisons.
    # Create a mask for rows where Age is greater than 35 and City is 'Houston'
    combined_mask = (df['Age'] > 35) & (df['City'] == 'Houston')
    filtered_data = df.iloc[combined_mask.values]
    print(filtered_data)
Output:
        Name  Age     City
    3  Smith   45  Houston
Here, we filter for entries where the individual’s age exceeds 35 and they reside in ‘Houston’.
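The | and ~ operators work the same way, and a boolean array in the row position can be paired with integer positions in the column position. A short sketch on an illustrative df_demo copy of the original sample data (the mask names are made up for the example):

```python
import pandas as pd

# Illustrative copy of the tutorial's sample data
df_demo = pd.DataFrame({
    'Name': ['John', 'Doe', 'Jane', 'Smith'],
    'Age': [28, 34, 22, 45],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
})

# OR: rows where Age is below 25 or City is 'Houston'
young_or_houston = (df_demo['Age'] < 25) | (df_demo['City'] == 'Houston')

# Boolean array for the rows, integer positions for the columns
subset = df_demo.iloc[young_or_houston.to_numpy(), [0, 2]]
print(subset['Name'].tolist())  # ['Jane', 'Smith']

# NOT: every row whose City is not 'Houston'
not_houston = ~(df_demo['City'] == 'Houston')
others = df_demo.iloc[not_houston.to_numpy()]
print(others['Name'].tolist())  # ['John', 'Doe', 'Jane']
```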
Error Handling and Common Pitfalls
Navigating Pandas DataFrames using iloc is typically smooth and intuitive. However, there are some potential pitfalls and errors that you might encounter.
One of the common mistakes is trying to access indices that do not exist in the DataFrame, leading to an IndexError.
    # Attempting to access the fifth row in a DataFrame with only four rows
    # will raise an error.
    try:
        print(df.iloc[4])
    except IndexError as e:
        print(f"Error: {e}")
Output:
    Error: single positional indexer is out-of-bounds
To avoid this, always ensure that the indices you provide fall within the valid range for your DataFrame.
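One way to stay within bounds is to check the position against len(df) before indexing. The sketch below uses a hypothetical helper, row_or_none (not part of pandas), on an illustrative df_demo copy of the sample data. It also shows that slices behave differently from single positions and lists: a slice that runs past the end is clipped instead of raising.

```python
import pandas as pd

# Illustrative copy of the tutorial's sample data
df_demo = pd.DataFrame({
    'Name': ['John', 'Doe', 'Jane', 'Smith'],
    'Age': [28, 34, 22, 45],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
})

def row_or_none(frame, position):
    """Hypothetical helper: return the row at `position`, or None if out of range."""
    if -len(frame) <= position < len(frame):
        return frame.iloc[position]
    return None

print(row_or_none(df_demo, 4))  # None

# A slice that extends past the last row is clipped, not rejected
tail = df_demo.iloc[2:10]
print(len(tail))  # 2

# A list containing an out-of-range position still raises IndexError
try:
    df_demo.iloc[[0, 4]]
    list_error = None
except IndexError as e:
    list_error = e
print(f"Error: {list_error}")
```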
Resource
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.iloc.html