8 Examples for Selecting Columns in Pandas query method
The query
method in Pandas is used to filter rows based on a condition that is specified as a string.
However, you can use it in conjunction with other functions to select columns as well.
In this tutorial, you’ll learn how to use the query
method to select columns from a Pandas DataFrame.
Using eval Function
You can use eval
function to select the columns after querying rows.
The following code demonstrates how you can use the query
and eval
functions to select specific columns from a DataFrame.
import pandas as pd df = pd.DataFrame({ 'ID': [1, 2, 3, 4], 'Age': [25, 30, 35, 40], 'Salary': [50000, 60000, 70000, 80000] }) filtered_df = df.query('Age > 30') selected_columns = filtered_df.eval('Age + Salary') print(selected_columns)
Output:
2 70700 3 80800 dtype: int64
Using loc or iloc
You can use loc
or iloc
after using query
to filter rows. This is the more commonly used approach.
filtered_df = df.query('Age > 30').loc[:, ['ID', 'Age']] print(filtered_df)
Output:
ID Age 2 3 35 3 4 40
Using Square Brackets
You can use square brackets to select specific columns. However, this is not a straightforward column selection but can be useful in some cases.
filtered_df = df.query('Age > 30').query('Salary > 60000')[['ID', 'Salary']] print(filtered_df)
Output:
ID Salary 2 3 70000 3 4 80000
Using filter with query
You can combine the filter
method with the query
method for column selection.
This is useful when you want to both filter rows and select columns using regex or other criteria.
# Query rows and then filter columns using `filter` filtered_df = df.query('Age > 30').filter(items=['ID', 'Salary']) print(filtered_df)
Output:
ID Salary 2 3 70000 3 4 80000
Using query in Combination with drop
You can first use the query
method to filter rows and then use the drop
method to remove the columns you don’t need.
# Query rows and then drop unwanted columns filtered_df = df.query('Age > 30').drop(columns=['Age']) print(filtered_df)
Output:
ID Salary 2 3 70000 3 4 80000
Select Columns Dynamically Using List Comprehensions
You can use list comprehensions to dynamically generate column names you want to select, and use them post-query.
# Generate dynamic column names, useful for wide dataframes selected_columns = [col for col in df.columns if 'ID' in col or 'Salary' in col] # Filter rows using query and then select columns filtered_df = df.query('Age > 30')[selected_columns] print(filtered_df)
Output:
ID Salary 2 3 70000 3 4 80000
Using apply and query
Another way to use the query
method in column selection is by transposing the DataFrame, using apply
for row-level operations, and then transposing back to the original structure.
This is a more intricate approach but offers flexibility.
df = pd.DataFrame({ 'ID': [1, 2, 3, 4], 'Age': [25, 30, 35, 40], 'Salary': [50000, 60000, 70000, 80000] }) # Transpose, apply query, and transpose back selected_columns_df = df.T.apply(lambda x: x).query('index == "ID" or index == "Salary"').T print(selected_columns_df)
Output:
ID Salary 0 1 50000 1 2 60000 2 3 70000 3 4 80000
Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.