Using the Pandas Query Method on Index Columns
While the query
method in Pandas is used for column-based operations, it is capable of handling index-based queries.
In this tutorial, you will learn how to query index columns using query
method.
Querying Rows Based on Index Column Using query
Let’s consider the same sample data:
import pandas as pd data = {'Name': ['User1', 'User2', 'User3', 'User4'], 'MonthlyRevenue': [2000, 1500, 3000, 4000], 'ContractLength': [12, 24, 12, 18]} df = pd.DataFrame(data) df.set_index('Name', inplace=True) print(df)
Output:
MonthlyRevenue ContractLength Name User1 2000 12 User2 1500 24 User3 3000 12 User4 4000 18
To query based on the index column, you need to use the index
identifier in your query string:
result = df.query("index == 'User1'") print(result)
Output:
MonthlyRevenue ContractLength Name User1 2000 12
Slicing the Index to Select a Range of Rows
To select a range of rows, you can use the index
identifier and the slicing notation in the query
method:
result = df.query("index >= 'User1' and index <= 'User3'") print(result)
Output:
MonthlyRevenue ContractLength Name User1 2000 12 User2 1500 24 User3 3000 12
Here, we see that the query has successfully returned the rows with index labels from 'User1'
to 'User3'
, inclusive.
Using Boolean Indexing on the Index
Boolean indexing is a powerful feature that allows you to filter rows based on certain conditions.
When paired with the query
method, you can perform complex filtering criteria with ease.
Suppose you want to filter rows where the index label contains the digit1
. To do this, you can use the str.contains
method along with query
result = df.query("index.str.contains('1')") print(result)
Output:
MonthlyRevenue ContractLength Name User1 2000 12
Querying with Inequality Operators
You can use the inequality operators like <
, <=
, >
, and >=
on index columns when they are lexicographically sortable (i.e., string-based or numeric).
Let’s say you want to filter rows where the index label is lexicographically greater than ‘User2’.
To do this, you can use the inequality operator >
.
result = df.query("index > 'User2'") print(result)
Output:
MonthlyRevenue ContractLength Name User3 3000 12 User4 4000 18
The output shows that the query method filters out rows where the index labels ‘User3’ and ‘User4’ are lexicographically greater than ‘User2’.
Combining Multiple Conditions When Querying the Index
The query
method supports combining conditions using logical operators like and
, or
, and not
.
Suppose you want to select rows where the index label is either ‘User1’ or ‘User3’ and the MonthlyRevenue
is above 2000.
You can combine these conditions as follows:
result = df.query("(index == 'User1' or index == 'User3') and MonthlyRevenue > 2000") print(result)
Output:
MonthlyRevenue ContractLength Name User3 3000 12
Select Rows Based on Index on a MultiIndex DataFrame
The query
method can work efficiently with MultiIndex DataFrames as well.
Here’s some sample data with a MultiIndex:
import pandas as pd arrays = [ ['User1', 'User1', 'User2', 'User2'], ['Jan', 'Feb', 'Jan', 'Feb'] ] index = pd.MultiIndex.from_arrays(arrays, names=('Name', 'Month')) data = {'MonthlyRevenue': [2000, 1800, 1500, 1600], 'ContractLength': [12, 12, 24, 24]} df = pd.DataFrame(data, index=index) print(df)
Output:
MonthlyRevenue ContractLength Name Month User1 Jan 2000 12 Feb 1800 12 User2 Jan 1500 24 Feb 1600 24
To query this MultiIndex DataFrame, you can use multiple levels of the index in your query string.
For instance, to select rows where the ‘Name’ is ‘User1’ and the ‘Month’ is ‘Jan’:
result = df.query("Name == 'User1' and Month == 'Jan'") print(result)
Output:
MonthlyRevenue ContractLength Name Month User1 Jan 2000 12
Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.