How to Convert YAML data To SQL in Python
In this tutorial, you’ll learn how to convert YAML data to SQL using Python.
You’ll learn how to handle various YAML structures, from simple key-value pairs to complex nested data.
Basic YAML to SQL Insert
To start, read a simple YAML file and insert its data into a SQL table.
import yaml import sqlite3 yaml_data = """ employees: - name: Ahmed age: 30 department: Engineering - name: Fatima age: 28 department: Marketing """ data = yaml.safe_load(yaml_data) conn = sqlite3.connect(':memory:') cursor = conn.cursor() cursor.execute(''' CREATE TABLE employees ( id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, age INTEGER, department TEXT ) ''') for emp in data['employees']: cursor.execute(''' INSERT INTO employees (name, age, department) VALUES (?, ?, ?) ''', (emp['name'], emp['age'], emp['department'])) cursor.execute('SELECT * FROM employees') rows = cursor.fetchall() print(rows)
Output:
[(1, 'Ahmed', 30, 'Engineering'), (2, 'Fatima', 28, 'Marketing')]
This code reads the YAML data containing a list of employees, creates an in-memory SQLite database, defines an employees
table, and inserts each employee into the table.
Finally, you fetch and print all rows from the table.
Handle Nested YAML
To handle nested YAML data and convert it into multiple related SQL tables, parse the nested structures and maintain parent-child relationships.
import yaml import sqlite3 yaml_data = """ departments: - name: Engineering employees: - name: Ahmed age: 30 - name: Mona age: 25 - name: Marketing employees: - name: Fatima age: 28 - name: Karim age: 32 """ data = yaml.safe_load(yaml_data) conn = sqlite3.connect(':memory:') cursor = conn.cursor() cursor.execute(''' CREATE TABLE departments ( id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT ) ''') cursor.execute(''' CREATE TABLE employees ( id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, age INTEGER, department_id INTEGER, FOREIGN KEY(department_id) REFERENCES departments(id) ) ''') for dept in data['departments']: # Insert department cursor.execute('INSERT INTO departments (name) VALUES (?)', (dept['name'],)) department_id = cursor.lastrowid # Insert employees for this department for emp in dept['employees']: cursor.execute(''' INSERT INTO employees (name, age, department_id) VALUES (?, ?, ?) ''', (emp['name'], emp['age'], department_id)) cursor.execute(''' SELECT employees.name, employees.age, departments.name FROM employees JOIN departments ON employees.department_id = departments.id ''') rows = cursor.fetchall() for row in rows: print(row)
Output:
('Ahmed', 30, 'Engineering') ('Mona', 25, 'Engineering') ('Fatima', 28, 'Marketing') ('Karim', 32, 'Marketing')
After parsing the nested YAML data, we insert each department into the departments
table and each employee into the employees
table.
We add the foreign key relationship via department_id
.
By joining the two tables, we display each employee along with their department name.
Process YAML Anchors and Aliases
To handle YAML anchors and aliases and avoid duplication in data, process them in Python to correctly insert referenced data into SQL.
import yaml import sqlite3 yaml_data = """ default_employee: &default_employee age: 30 department: 'Engineering' employees: - name: Ahmed <<: *default_employee - name: Fatima age: 28 department: 'Marketing' """ data = yaml.safe_load(yaml_data) # Connect to SQLite database conn = sqlite3.connect(':memory:') cursor = conn.cursor() # Create table cursor.execute(''' CREATE TABLE employees ( id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, age INTEGER, department TEXT ) ''') # Insert data into table for emp in data['employees']: cursor.execute(''' INSERT INTO employees (name, age, department) VALUES (?, ?, ?) ''', (emp['name'], emp.get('age'), emp.get('department'))) cursor.execute('SELECT * FROM employees') rows = cursor.fetchall() print(rows)
Output:
[(1, 'Ahmed', 30, 'Engineering'), (2, 'Fatima', 28, 'Marketing')]
In this code, the YAML anchor &default_employee
defines default properties.
The alias *default_employee
inherits these properties in the first employee entry.
The yaml.safe_load
method processes the anchors and aliases and yields a complete dictionary for each employee.
Then we insert the employees into the SQL table.
Insert YAML Sequence Data
You can insert YAML sequences directly into array fields if you’re working with a database like PostgreSQL that supports array data types.
import yaml import psycopg2 from psycopg2.extras import execute_values yaml_data = """ employees: - name: Ahmed skills: - Python - SQL - Git - name: Fatima skills: - Marketing - SEO - Content Writing """ data = yaml.safe_load(yaml_data) conn = psycopg2.connect( host="localhost", database="mydb", user="root", password="123" ) cursor = conn.cursor() # Create table with array type cursor.execute(''' CREATE TABLE IF NOT EXISTS employees ( id SERIAL PRIMARY KEY, name TEXT, skills TEXT[] ) ''') conn.commit() # Insert data into table employee_list = [ (emp['name'], emp['skills']) for emp in data['employees'] ] query = ''' INSERT INTO employees (name, skills) VALUES %s ''' execute_values(cursor, query, employee_list) conn.commit() cursor.execute('SELECT * FROM employees') rows = cursor.fetchall() for row in rows: print(row)
Output:
(1, 'Ahmed', ['Python', 'SQL', 'Git']) (2, 'Fatima', ['Marketing', 'SEO', 'Content Writing'])
This code connects to a PostgreSQL database and creates an employees
table with a skills
column of type TEXT[]
, which is an array of text.
Then we load the YAML data, extract each employee’s name and list of skills, and insert them into the table using execute_values
.
Finally, we retrieve and print the data.
Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.