Filter Using Pandas query method with multiple conditions
The query
method in Pandas allows you to filter your DataFrame using string expressions.
In this tutorial, you’ll learn how to use the query
method to filter data with multiple conditions.
Combining Conditions with & (AND)
When you need to combine multiple conditions, the ampersand (&
) serves as the logical AND operator.
The syntax is straightforward: you specify each condition within a string, and separate them using &
.
Firstly, let’s create a sample DataFrame:
import pandas as pd data = { 'ID': [1, 2, 3, 4, 5], 'Name': ['John', 'Emily', 'Michael', 'Sarah', 'Jessica'], 'Age': [28, 24, 22, 25, 29], 'Salary': [50000, 55000, 60000, 65000, 70000] } df = pd.DataFrame(data) print(df)
Output:
ID Name Age Salary 0 1 John 28 50000 1 2 Emily 24 55000 2 3 Michael 22 60000 3 4 Sarah 25 65000 4 5 Jessica 29 70000
Here, you have a DataFrame with the columns ID
, Name
, Age
, and Salary
.
Now, let’s use the query
method to filter out records where Age
is greater than 24 and Salary
is less than 70000:
filtered_df = df.query("Age > 24 & Salary < 70000") print(filtered_df)
Output:
ID Name Age Salary 0 1 John 28 50000 3 4 Sarah 25 65000
Combining Conditions with | (OR)
The pipe character (|
) acts as the logical OR operator in the query
method.
With this operator, you can find records that satisfy either one condition or another, or even both.
Continuing from the previous example, let’s filter the records where either Age
is less than 25 or Salary
is greater than 60000:
filtered_df = df.query("Age < 25 | Salary > 60000") print(filtered_df)
Output:
ID Name Age Salary 1 2 Emily 24 55000 2 3 Michael 22 60000 3 4 Sarah 25 65000 4 5 Jessica 29 70000
In this output, you’ll find that the records meet at least one of the specified conditions—either Age
is less than 25 or Salary
is greater than 60000.
Combining Conditions with ~ (NOT)
Sometimes, you’ll find yourself in a situation where you want to exclude specific records based on a condition.
The tilde (~
) serves as the logical NOT operator in the query
method.
By incorporating ~
, you can filter out records that meet a particular condition.
Let’s use the query
method to filter out records where the Age
is not greater than 24:
filtered_df = df.query("~(Age > 24)") print(filtered_df)
Output:
ID Name Age Salary 1 2 Emily 24 55000 2 3 Michael 22 60000
The ~
operator negates the condition specified.
Parentheses and Order of Operations
You can use parentheses to dictate the order in which conditions are evaluated in the query
method.
Suppose you want to select records where either the Age
is greater than 24 and the Salary
is less than 70000, or the Age
is less than 25.
filtered_df = df.query("(Age > 24 & Salary < 70000) | (Age < 25)") print(filtered_df)
Output:
ID Name Age Salary 0 1 John 28 50000 1 2 Emily 24 55000 2 3 Michael 22 60000 3 4 Sarah 25 65000
Here, parentheses make it clear that the AND operation (&
) should be carried out before the OR operation (|
).
Add Regex Condition
In the context of Pandas query
method, you can integrate regex to add another dimension to your data filtering capabilities.
The .str.contains()
function allows you to search for patterns within string-based columns.
Let’s modify our DataFrame a bit to include an Email
column:
df['Email'] = ['john@email.com', 'emily@email.com', 'michael@email.com', 'sarah@email.com', 'jessica@email.com'] print(df)
Output:
ID Name Age Salary Email 0 1 John 28 50000 john@email.com 1 2 Emily 24 55000 emily@email.com 2 3 Michael 22 60000 michael@email.com 3 4 Sarah 25 65000 sarah@email.com 4 5 Jessica 29 70000 jessica@email.com
Now let’s use .str.contains()
along with the query
method to filter out records where the Email
column starts with the letter “j” and Age
is greater than 28:
filtered_df = df.query('Email.str.contains("^j") & Age>28') print(filtered_df)
Output:
ID Name Age Salary Email 4 5 Jessica 29 70000 jessica@email.com
Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.