How to Filter YAML Data in Python
This tutorial will guide you through various methods to filter YAML data in Python.
You’ll learn how to extract specific information, apply conditions, and manipulate YAML data structures using Python’s built-in features and the PyYAML and re libraries.
Basic Filtering
Filter by key
To filter YAML data by key, you can load the YAML content and access specific keys directly:
import yaml

yaml_data = """
employee:
  name: "Fatima"
  age: 28
  department: "Marketing"
"""

data = yaml.safe_load(yaml_data)

# Access the 'name' key
name = data['employee']['name']
print(name)
Output:
Fatima
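Direct indexing raises a KeyError when a key is absent. As a minimal sketch, assuming you want to keep several keys at once and silently skip missing ones, a dictionary comprehension can do the selection. The helper name pick_keys and the 'salary' key are hypothetical and only serve the illustration.

```python
import yaml

yaml_data = """
employee:
  name: "Fatima"
  age: 28
  department: "Marketing"
"""

data = yaml.safe_load(yaml_data)


def pick_keys(mapping, wanted):
    """Return a new dict holding only the keys from `wanted` that exist in `mapping`."""
    return {key: mapping[key] for key in wanted if key in mapping}


# 'salary' is not in the document, so it is skipped instead of raising KeyError
subset = pick_keys(data["employee"], ["name", "department", "salary"])
print(subset)
```

With the sample document above, this should print only the name and department entries.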
Filter by value
You can filter data based on the value of a specific key:
import yaml

yaml_data = """
employees:
  - name: "Omar"
    age: 35
    department: "Finance"
  - name: "Layla"
    age: 29
    department: "Engineering"
  - name: "Hassan"
    age: 42
    department: "Finance"
"""

data = yaml.safe_load(yaml_data)

# Filter employees in the Finance department
finance_employees = [emp for emp in data['employees'] if emp['department'] == "Finance"]
print(finance_employees)
Output:
[{'name': 'Omar', 'age': 35, 'department': 'Finance'}, {'name': 'Hassan', 'age': 42, 'department': 'Finance'}]
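Real YAML files do not always carry every field on every record. The following sketch assumes some entries may lack a department key and that you want to match against several allowed values. It uses dict.get() so a missing key yields None instead of an exception. The record for "Mona" and the allowed_departments set are made up for this example.

```python
import yaml

yaml_data = """
employees:
  - name: "Omar"
    department: "Finance"
  - name: "Layla"
    department: "Engineering"
  - name: "Hassan"
    department: "Finance"
  - name: "Mona"
"""

data = yaml.safe_load(yaml_data)

# Hypothetical set of departments we want to keep
allowed_departments = {"Finance", "Engineering"}

# .get() returns None for the record without a 'department' key,
# and None is not in the set, so that record is dropped
matching_names = [
    emp["name"]
    for emp in data.get("employees", [])
    if emp.get("department") in allowed_departments
]
print(matching_names)
```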
Using list comprehensions
A list comprehension can filter records and extract a single field from each match in one step:
import yaml

yaml_data = """
products:
  - name: "Laptop"
    price: 1500
    in_stock: true
  - name: "Smartphone"
    price: 800
    in_stock: false
  - name: "Tablet"
    price: 600
    in_stock: true
"""

data = yaml.safe_load(yaml_data)

# Get names of products that are in stock
in_stock_products = [product['name'] for product in data['products'] if product['in_stock']]
print(in_stock_products)
Output:
['Laptop', 'Tablet']
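The same idea extends to other comprehension forms. As a rough sketch on the same product data, a dictionary comprehension can build a name-to-price lookup, and a generator expression with next() can stop at the first match. The 1000 price limit is an arbitrary value chosen for the example.

```python
import yaml

yaml_data = """
products:
  - name: "Laptop"
    price: 1500
    in_stock: true
  - name: "Smartphone"
    price: 800
    in_stock: false
  - name: "Tablet"
    price: 600
    in_stock: true
"""

products = yaml.safe_load(yaml_data)["products"]

# Dictionary comprehension: map name -> price for in-stock products only
price_by_name = {p["name"]: p["price"] for p in products if p["in_stock"]}
print(price_by_name)

# Generator expression: name of the first product under the limit, or None if there is none
first_under_limit = next((p["name"] for p in products if p["price"] < 1000), None)
print(first_under_limit)
```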
Apply lambda functions
Lambda functions can be used to filter YAML data dynamically:
import yaml

yaml_data = """
students:
  - name: "Noura"
    grade: 85
  - name: "Karim"
    grade: 92
  - name: "Salma"
    grade: 78
"""

data = yaml.safe_load(yaml_data)

# Filter students with grades above 80
high_achievers = list(filter(lambda s: s['grade'] > 80, data['students']))
print(high_achievers)
Output:
[{'name': 'Noura', 'grade': 85}, {'name': 'Karim', 'grade': 92}]
This outputs a list of students who scored above 80 by applying a lambda function within the filter() function.
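To make the "dynamic" part concrete, here is one possible sketch in which the condition is passed in as an argument, so the same helper serves different thresholds. The helper name filter_records and the threshold variable are assumptions for illustration, not part of PyYAML.

```python
import yaml

yaml_data = """
students:
  - name: "Noura"
    grade: 85
  - name: "Karim"
    grade: 92
  - name: "Salma"
    grade: 78
"""

students = yaml.safe_load(yaml_data)["students"]


def filter_records(records, predicate):
    """Return the records for which predicate(record) is truthy."""
    return list(filter(predicate, records))


# The threshold can come from user input or configuration
threshold = 80
high_achievers = filter_records(students, lambda s: s["grade"] > threshold)

# A second lambda sorts the matches from highest to lowest grade
ranked_names = [s["name"] for s in sorted(high_achievers, key=lambda s: s["grade"], reverse=True)]
print(ranked_names)
```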
Regular Expression Filtering
Filter keys with regex
You can use regular expressions to match YAML keys:
import yaml
import re

yaml_data = """
measurements:
  temp_morning: 20
  temp_evening: 15
  humidity_morning: 80
  humidity_evening: 70
"""

data = yaml.safe_load(yaml_data)

# Filter keys that start with 'temp_'
temp_measurements = {k: v for k, v in data['measurements'].items() if re.match(r'^temp_', k)}
print(temp_measurements)
Output:
{'temp_morning': 20, 'temp_evening': 15}
This outputs a dictionary of measurements where keys start with ‘temp_’ by using a regular expression.
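Note that re.match() only anchors at the start of the string. To match a suffix instead, one option is re.search() with a $ anchor. The sketch below also precompiles the pattern and guards against non-string keys, since YAML allows keys such as integers. The '_morning' suffix is just an example choice.

```python
import re

import yaml

yaml_data = """
measurements:
  temp_morning: 20
  temp_evening: 15
  humidity_morning: 80
  humidity_evening: 70
"""

measurements = yaml.safe_load(yaml_data)["measurements"]

# Compile once, reuse for every key; '$' anchors the match at the end of the key
morning_pattern = re.compile(r"_morning$")

morning_measurements = {
    key: value
    for key, value in measurements.items()
    # YAML keys are not guaranteed to be strings, so check before applying the regex
    if isinstance(key, str) and morning_pattern.search(key)
}
print(morning_measurements)
```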
Filter values with regex
Regular expressions can also filter based on string values:
import yaml
import re

yaml_data = """
logs:
  - date: "2023-10-01"
    message: "Error: failed to load module"
  - date: "2023-10-02"
    message: "Warning: deprecated API usage"
  - date: "2023-10-03"
    message: "Error: null pointer exception"
"""

data = yaml.safe_load(yaml_data)

# Filter logs containing 'Error' in the message
error_logs = [log for log in data['logs'] if re.search(r'Error', log['message'])]
print(error_logs)
Output:
[{'date': '2023-10-01', 'message': 'Error: failed to load module'}, {'date': '2023-10-03', 'message': 'Error: null pointer exception'}]
This outputs a list of logs where the ‘message’ contains the word ‘Error’ by searching with a regular expression.
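A plain search for 'Error' is case-sensitive and matches the word anywhere in the message. As a hedged variation, the sketch below anchors the pattern at the start of the message and ignores case. The lowercase "error:" entry and the "Info:" entry are invented sample data meant to show the difference.

```python
import re

import yaml

yaml_data = """
logs:
  - date: "2023-10-01"
    message: "Error: failed to load module"
  - date: "2023-10-02"
    message: "Warning: deprecated API usage"
  - date: "2023-10-03"
    message: "error: disk quota exceeded"
  - date: "2023-10-04"
    message: "Info: no Error detected"
"""

logs = yaml.safe_load(yaml_data)["logs"]

# '^' requires the message to start with the level; IGNORECASE accepts 'Error' and 'error'
error_pattern = re.compile(r"^error\b", re.IGNORECASE)

error_dates = [log["date"] for log in logs if error_pattern.search(log["message"])]
print(error_dates)
```

The "Info" entry is excluded even though it contains the word "Error", because the pattern only matches at the beginning of the message.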
Filter based on multiple conditions
You can filter YAML based on multiple criteria:
import yaml

yaml_data = """
books:
  - title: "Python Basics"
    author: "Aisha"
    year: 2019
    available: true
  - title: "Advanced Python"
    author: "Hossam"
    year: 2021
    available: false
  - title: "Data Science with Python"
    author: "Aisha"
    year: 2020
    available: true
"""

data = yaml.safe_load(yaml_data)

# Filter books by author 'Aisha' that are available
available_books_by_aisha = [book for book in data['books'] if book['author'] == "Aisha" and book['available']]
print(available_books_by_aisha)
Output:
[{'title': 'Python Basics', 'author': 'Aisha', 'year': 2019, 'available': True}, {'title': 'Data Science with Python', 'author': 'Aisha', 'year': 2020, 'available': True}]
This outputs a list of books authored by ‘Aisha’ that are currently available, filtering based on both ‘author’ and ‘available’ fields.
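When the number of criteria grows, chaining them with `and` gets hard to read. One possible approach, sketched here, is to keep the conditions in a list and combine them with all(). The third condition (year of 2020 or later) is an extra assumption added for illustration. The result is then written back to YAML with yaml.safe_dump(); the sort_keys=False argument assumes PyYAML 5.1 or newer.

```python
import yaml

yaml_data = """
books:
  - title: "Python Basics"
    author: "Aisha"
    year: 2019
    available: true
  - title: "Advanced Python"
    author: "Hossam"
    year: 2021
    available: false
  - title: "Data Science with Python"
    author: "Aisha"
    year: 2020
    available: true
"""

data = yaml.safe_load(yaml_data)

# Each condition is a small function that takes a book and returns True or False
conditions = [
    lambda book: book["author"] == "Aisha",
    lambda book: book["available"],
    lambda book: book["year"] >= 2020,  # hypothetical extra criterion
]

# A book is kept only if every condition holds
matches = [book for book in data["books"] if all(cond(book) for cond in conditions)]
matching_titles = [book["title"] for book in matches]
print(matching_titles)

# Serialize the filtered result back to YAML, preserving the original key order
filtered_yaml = yaml.safe_dump({"books": matches}, sort_keys=False)
print(filtered_yaml)
```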
Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.