8 Examples for Selecting Columns in Pandas query method

The query method in Pandas is used to filter rows based on a condition that is specified as a string.

However, you can use it in conjunction with other functions to select columns as well.

In this tutorial, you’ll learn how to use the query method to select columns from a Pandas DataFrame.

 

 

Using eval Function

You can use eval function to select the columns after querying rows.

The following code demonstrates how you can use the query and eval functions to select specific columns from a DataFrame.

import pandas as pd
df = pd.DataFrame({
    'ID': [1, 2, 3, 4],
    'Age': [25, 30, 35, 40],
    'Salary': [50000, 60000, 70000, 80000]
})
filtered_df = df.query('Age > 30')
selected_columns = filtered_df.eval('Age + Salary')
print(selected_columns)

Output:

2    70700
3    80800
dtype: int64

 

Using loc or iloc

You can use loc or iloc after using query to filter rows. This is the more commonly used approach.

filtered_df = df.query('Age > 30').loc[:, ['ID', 'Age']]
print(filtered_df)

Output:

   ID  Age
2   3   35
3   4   40

 

Using Square Brackets

You can use square brackets to select specific columns. However, this is not a straightforward column selection but can be useful in some cases.

filtered_df = df.query('Age > 30').query('Salary > 60000')[['ID', 'Salary']]
print(filtered_df)

Output:

   ID  Salary
2   3   70000
3   4   80000

 

Using filter with query

You can combine the filter method with the query method for column selection.

This is useful when you want to both filter rows and select columns using regex or other criteria.

# Query rows and then filter columns using `filter`
filtered_df = df.query('Age > 30').filter(items=['ID', 'Salary'])
print(filtered_df)

Output:

   ID  Salary
2   3   70000
3   4   80000

 

Using query in Combination with drop

You can first use the query method to filter rows and then use the drop method to remove the columns you don’t need.

# Query rows and then drop unwanted columns
filtered_df = df.query('Age > 30').drop(columns=['Age'])
print(filtered_df)

Output:

   ID  Salary
2   3   70000
3   4   80000

 

Select Columns Dynamically Using List Comprehensions

You can use list comprehensions to dynamically generate column names you want to select, and use them post-query.

# Generate dynamic column names, useful for wide dataframes
selected_columns = [col for col in df.columns if 'ID' in col or 'Salary' in col]

# Filter rows using query and then select columns
filtered_df = df.query('Age > 30')[selected_columns]
print(filtered_df)

Output:

   ID  Salary
2   3   70000
3   4   80000

 

Using apply and query

Another way to use the query method in column selection is by transposing the DataFrame, using apply for row-level operations, and then transposing back to the original structure.

This is a more intricate approach but offers flexibility.

df = pd.DataFrame({
    'ID': [1, 2, 3, 4],
    'Age': [25, 30, 35, 40],
    'Salary': [50000, 60000, 70000, 80000]
})

# Transpose, apply query, and transpose back
selected_columns_df = df.T.apply(lambda x: x).query('index == "ID" or index == "Salary"').T
print(selected_columns_df)

Output:

  ID Salary
0  1  50000
1  2  60000
2  3  70000
3  4  80000

 

Leave a Reply

Your email address will not be published. Required fields are marked *