Handle variables and references in YAML files using Python

This tutorial will guide you through managing variables and references in YAML files using Python.

We’ll cover everything from basic YAML variable syntax to advanced methods like anchors, aliases, and variable interpolation.

YAML Variable Syntax

Scalar variables

Scalar variables represent single values like strings, integers, or booleans. You can define scalar variables in YAML like this:

name: Ali
age: 28
is_active: true

In Python, you can load these variables using the PyYAML library (installed as pyyaml, imported as yaml):

import yaml
with open('data.yaml', 'r') as file:
    data = yaml.safe_load(file)
    print(data)

Output:

{'name': 'Ali', 'age': 28, 'is_active': True}

This output shows the scalar variables loaded into a Python dictionary with their corresponding values.
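PyYAML picks each scalar's Python type from the way the value is written, and it follows YAML 1.1 rules, where an unquoted yes or no becomes a boolean. The sketch below shows how quoting changes the result. It uses an inline document instead of a file, and every key in it is a made-up name for illustration.

```python
import yaml

# Hypothetical inline document; the keys exist only for this illustration.
document = """
age: 28
age_text: "28"
subscribed: yes
answer: "yes"
version: 1.10
"""

data = yaml.safe_load(document)

# Print each value next to the Python type PyYAML chose for it.
for key, value in data.items():
    print(f"{key}: {value!r} ({type(value).__name__})")
```

Quote a value whenever it must stay a string, such as a version number or a literal yes.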

List variables

You can define list variables to store multiple values in YAML. Here’s how you can do it:

cities:
  - Cairo
  - Alexandria
  - Giza

Alternatively, using inline notation:

numbers: [1, 2, 3, 4, 5]

In Python, you can access these lists after loading the YAML file:

import yaml
with open('data.yaml', 'r') as file:
    data = yaml.safe_load(file)
    print(data['cities'])
    print(data['numbers'])

Output:

['Cairo', 'Alexandria', 'Giza']
[1, 2, 3, 4, 5]

This output shows the list variables loaded as Python lists.

Dictionary variables

Dictionary variables hold key-value pairs and can be nested for complex structures. You can define them in YAML like this:

employee:
  name: Sarah
  department: Engineering
  skills:
    programming: Python
    database: MySQL

In Python, you can access nested dictionary variables:

import yaml
with open('data.yaml', 'r') as file:
    data = yaml.safe_load(file)
    print(data['employee'])

Output:

{'name': 'Sarah', 'department': 'Engineering', 'skills': {'programming': 'Python', 'database': 'MySQL'}}

This output shows the dictionary variables loaded as nested Python dictionaries.

Multi-line strings

To define multi-line strings in YAML, you can use the pipe | symbol (the literal block style), which keeps line breaks exactly as written. Here’s how:

description: |
  This is a multi-line string.
  It preserves line breaks.
  Useful for long texts.

In Python, you can read the multi-line string:

import yaml
with open('data.yaml', 'r') as file:
    data = yaml.safe_load(file)
    print(data['description'])

Output:

This is a multi-line string.
It preserves line breaks.
Useful for long texts.

This output shows the multi-line string with line breaks preserved. The loaded string also ends with a newline, so print() emits one extra blank line after it.
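The pipe is not the only block style. As a rough comparison, the sketch below loads the same two lines with the literal style (|), the folded style (>), and the literal style with a strip indicator (|-). The key names are hypothetical, and the document is inline so the example runs without a file.

```python
import yaml

# Hypothetical keys; each one holds the same two lines in a different block style.
document = """
literal: |
  line one
  line two
folded: >
  line one
  line two
stripped: |-
  line one
  line two
"""

data = yaml.safe_load(document)

# repr() makes the newlines visible.
for key, value in data.items():
    print(f"{key}: {value!r}")
```

The literal style keeps every line break, the folded style joins the lines with spaces, and the strip indicator drops the final newline.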

Override variables

YAML has no built-in override mechanism, but you can define a top-level default, let nested structures supply their own values, and resolve the fallback in Python. Here’s how you can do it:

default_database: postgres

environments:
  development:
    database: sqlite
  production:
    database: mysql

In Python, you can read each environment’s value and fall back to the default when it is missing:

import yaml
with open('databases.yaml', 'r') as file:
    data = yaml.safe_load(file)
    dev_db = data['environments']['development'].get('database', data['default_database'])
    prod_db = data['environments']['production'].get('database', data['default_database'])
    print('Development DB:', dev_db)
    print('Production DB:', prod_db)

Output:

Development DB: sqlite
Production DB: mysql

This output shows that each environment supplies its own database value, so the default_database fallback is never used here.
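When there are several settings, calling .get() for each one gets repetitive. One possible alternative is to overlay a whole mapping of defaults with the environment's own values. This sketch assumes a slightly different layout from the file above: a defaults mapping instead of the single default_database key, plus a port key and a staging environment that are invented for illustration. The dictionary stands in for the result of yaml.safe_load().

```python
# Stand-in for yaml.safe_load() output. The 'defaults' mapping, the 'port'
# key, and the 'staging' environment are hypothetical additions.
data = {
    "defaults": {"database": "postgres", "port": 5432},
    "environments": {
        "development": {"database": "sqlite"},
        "production": {"database": "mysql", "port": 3306},
        "staging": None,  # an empty YAML value loads as None
    },
}


def resolve_settings(data, environment):
    """Return the defaults overlaid with one environment's own values."""
    overrides = data["environments"].get(environment) or {}
    # Later keys win, so environment values replace the defaults.
    return {**data["defaults"], **overrides}


for name in data["environments"]:
    print(name, resolve_settings(data, name))
```

An environment that defines nothing, like staging here, simply receives the defaults.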

Anchors and Aliases

Anchors allow you to mark a node so that you can reuse it elsewhere in the same YAML file. You can create an anchor using the & symbol:

base_config: &base
  host: localhost
  port: 3306

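You can reference an anchor with an alias: the * symbol followed by the anchor name. Combined with the merge key <<, the alias pulls the anchored mapping’s keys into another mapping, and any key you define locally takes precedence. Here’s how: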

development:
  <<: *base
  database: dev_db

production:
  <<: *base
  database: prod_db
  host: prod.server.com

In Python, you can load the YAML file and see the resolved references:

import yaml
with open('config.yaml', 'r') as file:
    data = yaml.safe_load(file)
    print('Development Config:', data['development'])
    print('Production Config:', data['production'])

Output:

Development Config: {'host': 'localhost', 'port': 3306, 'database': 'dev_db'}
Production Config: {'host': 'prod.server.com', 'port': 3306, 'database': 'prod_db'}

This output shows that both environments inherit host and port from base_config. Development adds its own database, while production also overrides host.

The merge key << allows you to merge dictionaries (mappings). It is a YAML 1.1 feature that PyYAML supports, though not every YAML parser does. Here’s an example:

defaults: &defaults
  retries: 3
  timeout: 30

service:
  <<: *defaults
  endpoint: /api/service

In Python, you can see the merged dictionary:

import yaml
with open('service.yaml', 'r') as file:
    data = yaml.safe_load(file)
    print(data['service'])

Output:

{'retries': 3, 'timeout': 30, 'endpoint': '/api/service'}

This output shows that service inherits retries and timeout from defaults and adds its own endpoint.
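One detail worth knowing is that an alias used on its own behaves differently from an alias used with the merge key. With PyYAML's safe_load, a plain alias gives you the very same Python dictionary as the anchor, while a merge builds a new one. The sketch below illustrates this with an inline document; the shared and merged keys are hypothetical names.

```python
import yaml

# 'shared' and 'merged' are hypothetical keys for this illustration.
document = """
defaults: &defaults
  retries: 3
  timeout: 30
shared: *defaults
merged:
  <<: *defaults
"""

data = yaml.safe_load(document)

# A plain alias resolves to the same dictionary object as the anchor.
print(data["shared"] is data["defaults"])
# A merge copies the keys into a separate dictionary.
print(data["merged"] is data["defaults"])

# Changing the aliased dictionary therefore changes the anchor too.
data["shared"]["retries"] = 5
print(data["defaults"]["retries"])
print(data["merged"]["retries"])
```

If you plan to modify loaded values, copy aliased dictionaries first (for example with copy.deepcopy) so a change in one place does not leak into another.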

Variable Interpolation

YAML doesn’t support variable interpolation natively, but you can perform it in Python. Here’s a YAML file with placeholders:

message: "Welcome, ${name}!"

In Python, you can replace the placeholders:

import yaml
with open('message.yaml', 'r') as file:
    data = yaml.safe_load(file)
    message = data['message'].replace('${name}', 'Youssef')
    print(message)

Output:

Welcome, Youssef!

This output shows the placeholder ${name} replaced with ‘Youssef’.
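Since the placeholder already uses the ${name} form, Python's standard string.Template class can do the same job without a hand-written replace() call for every variable. This is a minimal sketch; the dictionary stands in for the data loaded from message.yaml, and the ${city} placeholder is invented to show what happens when a value is missing.

```python
from string import Template

# Stand-in for the dictionary returned by yaml.safe_load().
data = {"message": "Welcome, ${name}!"}

# substitute() fills every placeholder and raises KeyError if one is missing.
message = Template(data["message"]).substitute(name="Youssef")
print(message)

# safe_substitute() leaves unknown placeholders untouched instead of raising.
# The ${city} placeholder is hypothetical.
partial = Template("Welcome, ${name} from ${city}!").safe_substitute(name="Youssef")
print(partial)
```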

Using environment variables

YAML does not read environment variables itself, but you can put placeholders for them in your YAML files and substitute them in Python. Here’s how:

config:
  path: ${USERDIR}\app\config

In Python, you can perform the substitution:

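import yaml
import os
with open('config.yaml', 'r') as file:
    data = yaml.safe_load(file)
    # The placeholder must match the one in the YAML file (${USERDIR});
    # USERPROFILE is the Windows variable that holds the user directory.
    path = data['config']['path'].replace('${USERDIR}', os.environ['USERPROFILE'])
    print('Config Path:', path)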

Output:

Config Path: C:\Users\Mokhtar\app\config

This output shows the ${USERDIR} placeholder replaced with the user directory read from the USERPROFILE environment variable, which is set on Windows.
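If the placeholder names match real environment variable names, the standard os.path.expandvars() function can expand them for you. The sketch below applies it to every string in a loaded structure. It is only an illustration: the expand_env helper, the dirs key, and the demo value assigned to USERDIR are assumptions, and the dictionary stands in for yaml.safe_load() output.

```python
import os


def expand_env(value):
    """Recursively expand ${VAR} placeholders in every string of a structure."""
    if isinstance(value, str):
        return os.path.expandvars(value)
    if isinstance(value, dict):
        return {key: expand_env(item) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item) for item in value]
    return value  # numbers, booleans, and None pass through unchanged


# Demo value so the sketch runs anywhere; normally USERDIR is set outside Python.
os.environ["USERDIR"] = "/home/demo"

# Stand-in for yaml.safe_load() output; the 'dirs' key is hypothetical.
data = {
    "config": {
        "path": "${USERDIR}/app/config",
        "retries": 3,
        "dirs": ["${USERDIR}/logs"],
    }
}

expanded = expand_env(data)
print(expanded)
```

Placeholders that name an undefined variable are left as they are, so it is worth checking the result for leftover ${...} text.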

Nested variable references

You can write references to other variables within the YAML file. YAML will not resolve them itself, so the substitution happens in Python. Here’s an example:

variables:
  base_url: http://example.com
endpoints:
  user: ${variables.base_url}/user
  admin: ${variables.base_url}/admin

In Python, you can substitute the nested variables:

import yaml
with open('endpoints.yaml', 'r') as file:
    data = yaml.safe_load(file)
    base_url = data['variables']['base_url']
    user_endpoint = data['endpoints']['user'].replace('${variables.base_url}', base_url)
    admin_endpoint = data['endpoints']['admin'].replace('${variables.base_url}', base_url)
    print('User Endpoint:', user_endpoint)
    print('Admin Endpoint:', admin_endpoint)

Output:

User Endpoint: http://example.com/user
Admin Endpoint: http://example.com/admin

This output shows the nested variable references resolved correctly.
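Writing one replace() call per placeholder stops scaling once a file has many references. A more general approach, sketched below under some assumptions, is to find every ${a.b} placeholder with a regular expression and look up the dotted path in the loaded data. The lookup and resolve helpers are hypothetical names, and the dictionary stands in for the content of endpoints.yaml.

```python
import re

# Matches ${anything.without.a.closing.brace}
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def lookup(root, dotted_path):
    """Follow a dotted path such as 'variables.base_url' through nested dicts."""
    node = root
    for part in dotted_path.split("."):
        node = node[part]  # raises KeyError if the reference does not exist
    return node


def resolve(value, root):
    """Return a copy of value with every ${a.b} placeholder filled in from root."""
    if isinstance(value, str):
        return PLACEHOLDER.sub(lambda m: str(lookup(root, m.group(1))), value)
    if isinstance(value, dict):
        return {key: resolve(item, root) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve(item, root) for item in value]
    return value


# Stand-in for yaml.safe_load() output.
data = {
    "variables": {"base_url": "http://example.com"},
    "endpoints": {
        "user": "${variables.base_url}/user",
        "admin": "${variables.base_url}/admin",
    },
}

resolved = resolve(data, data)
print(resolved["endpoints"])
```

This version makes a single pass, so a placeholder whose value contains another placeholder is not expanded a second time.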

Use Variables from Another YAML File

You can use variables from one YAML file in another by loading both files in Python. Here’s variables.yaml:

credentials:
  username: kamal
  password: pass123

And config.yaml:

database:
  host: localhost
  user: ${credentials.username}
  pass: ${credentials.password}

In Python, you can combine them:

import yaml
with open('variables.yaml', 'r') as file:
    variables = yaml.safe_load(file)
with open('config.yaml', 'r') as file:
    config = yaml.safe_load(file)
credentials = variables['credentials']
config['database']['user'] = config['database']['user'].replace('${credentials.username}', credentials['username'])
config['database']['pass'] = config['database']['pass'].replace('${credentials.password}', credentials['password'])
print(config['database'])

Output:

{'host': 'localhost', 'user': 'kamal', 'pass': 'pass123'}

This output shows that the variables from variables.yaml are used in config.yaml.
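Another option is to fill in the placeholders in the raw text of config.yaml before parsing it, so the parser only ever sees final values. The sketch below tries this with inline strings in place of the two files; the replace_reference helper is a hypothetical name.

```python
import re
import yaml

# Inline stand-ins for variables.yaml and config.yaml.
variables_text = """
credentials:
  username: kamal
  password: pass123
"""
config_text = """
database:
  host: localhost
  user: ${credentials.username}
  pass: ${credentials.password}
"""

variables = yaml.safe_load(variables_text)


def replace_reference(match):
    """Look up a dotted reference such as 'credentials.username'."""
    node = variables
    for part in match.group(1).split("."):
        node = node[part]
    return str(node)


# Substitute in the raw text first, then parse the rendered document.
rendered = re.sub(r"\$\{([^}]+)\}", replace_reference, config_text)
config = yaml.safe_load(rendered)
print(config["database"])
```

Because the values are pasted in as text, YAML re-interprets them: a value such as 123 would load as an integer, and a value containing ": " or " #" could change how the line is parsed. Substituting after loading, as shown earlier, avoids that.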

Variable Substitution

You can define {name}-style placeholders in your YAML file and fill them in with Python’s str.format() method. Here’s how:

template: "Dear {name}, your order {order_id} is confirmed."

In Python, perform the substitution:

import yaml
with open('template.yaml', 'r') as file:
    data = yaml.safe_load(file)
    message = data['template'].format(name='Salma', order_id='12345')
    print(message)

Output:

Dear Salma, your order 12345 is confirmed.

This output shows the placeholders {name} and {order_id} replaced with actual values.
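Keep in mind that format() raises a KeyError when a placeholder has no matching argument. If you want to fill a template in stages, one possible approach is format_map() with a dictionary subclass that puts unknown placeholders back. The KeepMissing class below is a hypothetical helper, and the template string stands in for the value loaded from template.yaml.

```python
class KeepMissing(dict):
    """Dictionary that returns the placeholder itself for unknown keys."""

    def __missing__(self, key):
        return "{" + key + "}"


# Stand-in for data['template'] loaded from the YAML file.
template = "Dear {name}, your order {order_id} is confirmed."

# Fill in only the name; {order_id} survives for a later step.
partial = template.format_map(KeepMissing(name="Salma"))
print(partial)

# Fill in the remaining placeholder.
complete = partial.format(order_id="12345")
print(complete)
```

Any literal braces in the template text must be doubled ({{ and }}) so that format() does not treat them as placeholders.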
