Using variables in Pandas query (Build Dynamic queries)
The query
method allows you to filter and subset DataFrames in Pandas.
It enables you to write SQL-like queries to filter your data.
In this tutorial, we’ll go through the process of incorporating variables in your query
statements to make dynamic queries.
Using ‘@’ Symbol for Direct Substitution
The @
symbol allows you to incorporate Python variables into your query string.
First, you’ll need to create a sample DataFrame.
import pandas as pd data = { 'CustomerID': [1, 2, 3, 4, 5], 'PlanType': ['Basic', 'Premium', 'Basic', 'Enterprise', 'Premium'], 'MonthlyCharge': [10, 30, 10, 50, 30] } df = pd.DataFrame(data) print(df)
Output:
CustomerID PlanType MonthlyCharge 0 1 Basic 10 1 2 Premium 30 2 3 Basic 10 3 4 Enterprise 50 4 5 Premium 30
Let’s say you want to filter rows where MonthlyCharge
is 30.
You can store this value in a variable and then use it in the query
method with the @
symbol.
charge_to_filter = 30 filtered_df = df.query('MonthlyCharge == @charge_to_filter') print(filtered_df)
Output:
CustomerID PlanType MonthlyCharge 1 2 Premium 30 4 5 Premium 30
Using String Formatting Methods
Another way to use variables in the Pandas query
method is by using Python’s string formatting.
This technique is useful when you’re dealing with more complex queries or when you need to build a query string dynamically.
Using f-strings
Python 3.6 introduced f-strings, a way to embed expressions inside string literals using curly braces {}
. Let’s use f-strings to filter rows based on multiple conditions.
Suppose you have a variable for MonthlyCharge
and another one for PlanType
, and you want to filter data based on both.
charge_to_filter = 30 plan_to_filter = 'Premium' query_str = f"MonthlyCharge == {charge_to_filter} and PlanType == '{plan_to_filter}'" filtered_df = df.query(query_str) print(filtered_df)
Output:
CustomerID PlanType MonthlyCharge 1 2 Premium 30 4 5 Premium 30
Notice how the query string was built using f-string formatting. The variables charge_to_filter
and plan_to_filter
are directly embedded into the query string.
Using .format() Method
If you’re using a version of Python prior to 3.6, or if you prefer the .format()
method for string interpolation, you can get the same result.
query_str = "MonthlyCharge == {} and PlanType == '{}'".format(charge_to_filter, plan_to_filter) filtered_df = df.query(query_str) print(filtered_df)
Output:
CustomerID PlanType MonthlyCharge 1 2 Premium 30 4 5 Premium 30
Use of Logical Operators with Variables
In Pandas, you can use AND
, OR
, and NOT
operators along with variables in the query
method to apply multiple conditions.
Using AND Operator
You can combine multiple conditions using the AND
operator. For example, to filter rows based on a specific PlanType
and a range of MonthlyCharge
values, you can do the following:
min_charge = 20 max_charge = 40 plan_to_filter = 'Premium' # Using AND operator filtered_df = df.query('@min_charge <= MonthlyCharge <= @max_charge and PlanType == @plan_to_filter') print(filtered_df)
Output:
CustomerID PlanType MonthlyCharge 1 2 Premium 30 4 5 Premium 30
The output DataFrame contains rows where the MonthlyCharge
is between 20 and 40 and the PlanType
is ‘Premium’.
Using OR Operator
With the OR
operator, you can filter rows that meet at least one of several conditions.
filtered_df = df.query('MonthlyCharge == @min_charge or PlanType == @plan_to_filter') print(filtered_df)
Output:
CustomerID PlanType MonthlyCharge 1 2 Premium 30 4 5 Premium 30
The output DataFrame includes rows where either the MonthlyCharge
is 20 or the PlanType
is ‘Premium’.
Using NOT Operator
The NOT
operator allows you to exclude rows that meet certain conditions.
filtered_df = df.query('not (MonthlyCharge > @min_charge and PlanType == @plan_to_filter)') print(filtered_df)
Output:
CustomerID PlanType MonthlyCharge 0 1 Basic 10 2 3 Basic 10 3 4 Enterprise 50
Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.