How to Flatten a YAML File Using Python

In this tutorial, you’ll learn several ways to flatten a YAML file in Python, that is, to turn nested mappings into a single level of delimiter-joined keys such as details.department.name.

We’ll explore approaches ranging from custom recursive functions to specialized libraries.

Using a Recursive Function

You can create a recursive function to traverse the YAML structure and flatten it into a single dictionary.

import yaml

# name and details are indented so that they nest under employee
yaml_content = """
employee:
  name: Amina
  details:
    age: 30
    department:
      name: Engineering
      floor: 5
"""

def flatten_dict(d, parent_key='', sep='.'):
    """Recursively flatten nested dictionaries into delimiter-joined keys."""
    items = {}
    for k, v in d.items():
        # Prefix the key with its parent's path, if there is one
        new_key = f"{parent_key}{sep}{k}" if parent_key else k
        if isinstance(v, dict):
            # Descend into nested mappings and merge their flattened items
            items.update(flatten_dict(v, new_key, sep=sep))
        else:
            items[new_key] = v
    return items

data = yaml.safe_load(yaml_content)
flat_data = flatten_dict(data)
flat_yaml = yaml.dump(flat_data, default_flow_style=False)
print(flat_yaml)

Output:

employee.details.age: 30
employee.details.department.floor: 5
employee.details.department.name: Engineering
employee.name: Amina
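
The function above descends only into dictionaries, so a YAML sequence would be stored whole under a single key. Below is a minimal sketch of one way to extend it, assuming list items should be keyed by their index. The name flatten_any and the sample data are illustrative and not part of the original example; the data is written as a plain dictionary of the kind yaml.safe_load() would return.

```python
def flatten_any(node, parent_key="", sep="."):
    """Flatten nested dicts and lists into a single-level dict."""
    if isinstance(node, dict):
        children = node.items()
    elif isinstance(node, list):
        # List items are keyed by their position: 0, 1, 2, ...
        children = enumerate(node)
    else:
        # Scalar leaf: store it under the path built so far
        return {parent_key: node}

    items = {}
    for key, value in children:
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        items.update(flatten_any(value, new_key, sep))
    return items


# Hypothetical data, shaped like the result of yaml.safe_load()
data = {"team": {"lead": "Amina", "members": [{"name": "Omar"}, {"name": "Layla"}]}}
flat = flatten_any(data)
print(flat)
```

Note that in this sketch empty dictionaries and empty lists contribute no keys at all, which may or may not be what you want.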

Manually Traversing

If the structure of the YAML file is known in advance, you can traverse it manually and build the flat keys yourself, without using recursion.

import yaml
yaml_content = """
company:
  ceo: Karim
  employees:
    - name: Layla
      role: Designer
    - name: Omar
      role: Developer
"""
data = yaml.safe_load(yaml_content)
flat_data = {}
flat_data['company.ceo'] = data['company']['ceo']
for idx, emp in enumerate(data['company']['employees']):
    flat_data[f'company.employees.{idx}.name'] = emp['name']
    flat_data[f'company.employees.{idx}.role'] = emp['role']
flat_yaml = yaml.dump(flat_data, default_flow_style=False)
print(flat_yaml)

Output:

company.ceo: Karim
company.employees.0.name: Layla
company.employees.0.role: Designer
company.employees.1.name: Omar
company.employees.1.role: Developer
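
The manual approach hard-codes every key, so it has to be rewritten whenever the file's layout changes. As a rough sketch of a non-recursive alternative that works for arbitrary nesting, an explicit stack can stand in for the call stack. The function name flatten_iterative is hypothetical, and the data is written as a plain dictionary mirroring the company example, as yaml.safe_load() would return it.

```python
def flatten_iterative(data, sep="."):
    """Flatten nested dicts and lists using an explicit stack instead of recursion."""
    flat = {}
    stack = [("", data)]  # (key prefix, node) pairs still to visit
    while stack:
        prefix, node = stack.pop()
        if isinstance(node, dict):
            children = node.items()
        elif isinstance(node, list):
            children = enumerate(node)
        else:
            flat[prefix] = node  # scalar leaf
            continue
        for key, value in children:
            new_key = f"{prefix}{sep}{key}" if prefix else str(key)
            stack.append((new_key, value))
    return flat


data = {
    "company": {
        "ceo": "Karim",
        "employees": [
            {"name": "Layla", "role": "Designer"},
            {"name": "Omar", "role": "Developer"},
        ],
    }
}
flat_data = flatten_iterative(data)
# The stack visits nodes in last-in, first-out order, so sort the keys for display
for key in sorted(flat_data):
    print(f"{key}: {flat_data[key]}")
```

Because the traversal order differs from the order in the file, sort the keys (or let yaml.dump sort them) if a stable layout matters.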

Using flatdict

You can use the flatdict library to simplify the flattening process.

import yaml
import flatdict

# title and team are indented so that they nest under project
yaml_content = """
project:
  title: CairoApp
  team:
    leader: Sara
    members:
      frontend: Tarek
      backend: Mona
"""

data = yaml.safe_load(yaml_content)
# FlatDict exposes nested keys as delimiter-joined strings
flat = flatdict.FlatDict(data, delimiter='.')
# Convert back to a plain dict so yaml.dump can serialize it
flat_dict = dict(flat)
flat_yaml = yaml.dump(flat_dict, default_flow_style=False)
print(flat_yaml)

Output:

project.team.leader: Sara
project.team.members.backend: Mona
project.team.members.frontend: Tarek
project.title: CairoApp
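
Any delimiter-based flattening, including the flatdict call above, can become ambiguous when an original key already contains the delimiter: a literal key team.leader would be indistinguishable from leader nested under team. The sketch below scans loaded data for such keys before flattening. The helper keys_containing and the sample data are hypothetical and use only the standard library.

```python
def keys_containing(data, delimiter="."):
    """Return the paths (as tuples) of keys that already contain the delimiter."""
    offenders = []
    stack = [((), data)]  # (path so far, node) pairs still to visit
    while stack:
        path, node = stack.pop()
        if isinstance(node, dict):
            for key, value in node.items():
                key_path = path + (key,)
                if delimiter in str(key):
                    offenders.append(key_path)
                stack.append((key_path, value))
        elif isinstance(node, list):
            for idx, value in enumerate(node):
                stack.append((path + (idx,), value))
    return offenders


# Hypothetical data, shaped like the result of yaml.safe_load()
data = {"project": {"title": "CairoApp", "team.leader": "Sara"}}
conflicts = keys_containing(data)
print(conflicts)
```

If the list is non-empty, choosing a different delimiter for those files is one way to keep the flattened keys unambiguous.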

Using flatten-dict

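Another option is the flatten-dict library. Its flatten function takes a reducer that determines how parent and child keys are combined; the 'dot' reducer joins them with periods.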

import yaml
from flatten_dict import flatten
yaml_content = """
university:
  name: Alexandria University
  faculties:
    engineering:
      head: Youssef
    arts:
      head: Nadia
"""
data = yaml.safe_load(yaml_content)
flat_data = flatten(data, reducer='dot')
flat_yaml = yaml.dump(flat_data, default_flow_style=False)
print(flat_yaml)

Output:

university.faculties.arts.head: Nadia
university.faculties.engineering.head: Youssef
university.name: Alexandria University
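
To make the role of a reducer more concrete, here is a plain-Python illustration of the idea: a small function that receives the parent key and the child key and returns the combined key. This is not the library's implementation, and flatten_with, dot_reducer, and path_reducer are hypothetical names used only for this sketch.

```python
def dot_reducer(parent, child):
    """Join keys with a period; top-level keys have no parent."""
    return child if parent is None else f"{parent}.{child}"


def path_reducer(parent, child):
    """Join keys with a slash, like a file path."""
    return child if parent is None else f"{parent}/{child}"


def flatten_with(d, reducer, parent=None):
    """Flatten nested dicts, delegating key construction to the reducer."""
    flat = {}
    for key, value in d.items():
        flat_key = reducer(parent, key)
        if isinstance(value, dict) and value:
            flat.update(flatten_with(value, reducer, flat_key))
        else:
            flat[flat_key] = value
    return flat


# Hypothetical data, shaped like the result of yaml.safe_load()
data = {
    "university": {
        "name": "Alexandria University",
        "faculties": {"arts": {"head": "Nadia"}},
    }
}
dotted = flatten_with(data, dot_reducer)
pathed = flatten_with(data, path_reducer)
print(dotted)
print(pathed)
```

Swapping the reducer changes only how keys are spelled, while the traversal stays the same.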

Using Pandas

You can use the pandas json_normalize function to flatten YAML data by normalizing a list of nested records into a DataFrame with one row per record.

import yaml
import pandas as pd
yaml_content = """
store:
  books:
    - title: "Python Basics"
      author: "Hassan"
    - title: "Advanced Python"
      author: "Maya"
  location: "Downtown"
"""
data = yaml.safe_load(yaml_content)
books = pd.json_normalize(data['store']['books'])
books['location'] = data['store']['location']
flat_data = books.to_dict(orient='records')
flat_yaml = yaml.dump(flat_data, default_flow_style=False)
print(flat_yaml)

Output:

- author: Hassan
  location: Downtown
  title: Python Basics
- author: Maya
  location: Downtown
  title: Advanced Python
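
The location column above is attached in a separate step. As a sketch of an alternative, json_normalize can do both at once through its record_path and meta parameters. The data here is written as a plain dictionary mirroring the store example, as yaml.safe_load() would return it.

```python
import pandas as pd

data = {
    "store": {
        "books": [
            {"title": "Python Basics", "author": "Hassan"},
            {"title": "Advanced Python", "author": "Maya"},
        ],
        "location": "Downtown",
    }
}

# record_path selects the list to expand into rows;
# meta lists parent-level fields to repeat on every row
books = pd.json_normalize(data["store"], record_path="books", meta=["location"])
records = books.to_dict(orient="records")
print(records)
```

The resulting records can be passed to yaml.dump in the same way as in the example above.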