JSON vs. YAML (Performance Comparison)
In this tutorial, we'll compare the performance of JSON and YAML in Python, using the built-in json module and the ruamel.yaml library.
We'll measure parsing speed, parsing of nested structures, and serialization speed.
Parsing Benchmark Test
To compare parsing speeds, you can measure the time it takes to load data from JSON and YAML formats using the timeit
module.
import json
import timeit
from ruamel.yaml import YAML

json_data = '''
{
    "users": [
        {"id": 1, "name": "Ahmed", "age": 30},
        {"id": 2, "name": "Fatima", "age": 25},
        {"id": 3, "name": "Hassan", "age": 27}
    ]
}
'''

yaml_data = '''
users:
  - id: 1
    name: Ahmed
    age: 30
  - id: 2
    name: Fatima
    age: 25
  - id: 3
    name: Hassan
    age: 27
'''

def parse_json():
    json.loads(json_data)

def parse_yaml():
    # A new YAML instance is created on every call, so its setup cost is timed too
    yaml = YAML(typ='safe')
    yaml.load(yaml_data)

# Run each parser 10,000 times and report the total elapsed time
json_time = timeit.timeit(parse_json, number=10000)
yaml_time = timeit.timeit(parse_yaml, number=10000)
print(f"JSON parsing time: {json_time:.6f} seconds")
print(f"YAML parsing time: {yaml_time:.6f} seconds")
Output:
JSON parsing time: 0.024818 seconds
YAML parsing time: 4.286867 seconds
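In this run, JSON parsing is roughly 170 times faster than YAML parsing (about 0.025 seconds versus 4.29 seconds for 10,000 iterations). Note that parse_yaml() creates a new YAML instance on every call, so the YAML figure includes that setup cost as well as the parsing itself.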
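Because the YAML figure mixes setup and parsing, it can help to time parsing with an instance that is built once. The following is a minimal sketch under that assumption, not a definitive benchmark: best_per_call is a hypothetical helper introduced here for illustration, the sample data is shortened to one user, and the YAML half is skipped when ruamel.yaml is not installed.

```python
import json
import timeit

def best_per_call(func, number=1000, repeat=5):
    """Return the best observed time per call, in seconds (hypothetical helper)."""
    # timeit.repeat runs the whole timing loop `repeat` times;
    # taking the minimum reduces noise from other processes.
    timings = timeit.repeat(func, number=number, repeat=repeat)
    return min(timings) / number

json_data = '{"users": [{"id": 1, "name": "Ahmed", "age": 30}]}'
yaml_data = "users:\n  - id: 1\n    name: Ahmed\n    age: 30\n"

json_per_call = best_per_call(lambda: json.loads(json_data))
print(f"JSON: {json_per_call * 1e6:.2f} microseconds per call")

try:
    from ruamel.yaml import YAML
except ImportError:
    YAML = None  # ruamel.yaml is not installed, so skip the YAML half

if YAML is not None:
    yaml = YAML(typ="safe")  # built once, outside the timed code
    yaml_per_call = best_per_call(lambda: yaml.load(yaml_data), number=200)
    print(f"YAML (reused instance): {yaml_per_call * 1e6:.2f} microseconds per call")
```

Reporting the minimum of several repeats, rather than a single total, is one common way to make timings less sensitive to background load. The numbers will differ from machine to machine.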
Nested Data Performance
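To compare parsing of nested data structures, you can measure parsing times for data that nests mappings inside a list under a top-level key. The example below reuses the sample from the previous test, which has that shape.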
import json
import timeit
from ruamel.yaml import YAML

json_data = '''
{
    "users": [
        {"id": 1, "name": "Ahmed", "age": 30},
        {"id": 2, "name": "Fatima", "age": 25},
        {"id": 3, "name": "Hassan", "age": 27}
    ]
}
'''

yaml_data = '''
users:
  - id: 1
    name: Ahmed
    age: 30
  - id: 2
    name: Fatima
    age: 25
  - id: 3
    name: Hassan
    age: 27
'''

def parse_json():
    json.loads(json_data)

def parse_yaml():
    yaml = YAML(typ='safe')
    yaml.load(yaml_data)

json_time = timeit.timeit(parse_json, number=10000)
yaml_time = timeit.timeit(parse_yaml, number=10000)
print(f"JSON parsing time: {json_time:.6f} seconds")
print(f"YAML parsing time: {yaml_time:.6f} seconds")
Output:
JSON parsing time: 0.024721 seconds
YAML parsing time: 4.423315 seconds
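For this nested sample, JSON parsing is again roughly 180 times faster than YAML parsing (about 0.025 seconds versus 4.42 seconds for 10,000 iterations). The sample is only three levels deep, so this result says little about how either parser scales with nesting depth.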
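To test deeper nesting, the input can be generated rather than written by hand. The sketch below is one possible way to do that, under stated assumptions: build_nested and nested_yaml_text are hypothetical helpers introduced here, the depth of 50 is an arbitrary choice, and the YAML text is written directly as block-style mappings so that no YAML library is needed to produce it.

```python
import json

def build_nested(depth):
    """Return a dict nested `depth` levels deep.

    For depth=2 the result is {"level": {"level": {"value": 1}}}.
    """
    node = {"value": 1}
    for _ in range(depth):
        node = {"level": node}  # wrap the current structure in one more mapping
    return node

def nested_yaml_text(depth):
    """Write the same structure as block-style YAML, two spaces per level."""
    lines = [("  " * i) + "level:" for i in range(depth)]
    lines.append(("  " * depth) + "value: 1")
    return "\n".join(lines) + "\n"

depth = 50  # arbitrary depth chosen for illustration
nested = build_nested(depth)
json_text = json.dumps(nested)
yaml_text = nested_yaml_text(depth)

# Sanity check: the JSON text round-trips to the structure it was built from
print(json.loads(json_text) == nested)
print(nested_yaml_text(2))
```

Assigning json_text and yaml_text to json_data and yaml_data in the benchmark above would then time a structure whose depth is controlled by a single parameter.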
Serialization (Dumping) Speed
To compare serialization speeds, you can measure the time it takes to dump Python data structures into JSON and YAML formats.
import json
import timeit
from ruamel.yaml import YAML
from io import StringIO

data = {
    "users": [
        {"id": 1, "name": "Aisha", "age": 30},
        {"id": 2, "name": "Omar", "age": 25},
        {"id": 3, "name": "Heba", "age": 27}
    ]
}

def dump_json():
    json.dumps(data)

def dump_yaml():
    # YAML() with no arguments uses ruamel.yaml's default round-trip mode,
    # unlike the typ='safe' instance used in the parsing tests
    yaml = YAML()
    # ruamel.yaml writes to a stream, so dump into an in-memory buffer
    stream = StringIO()
    yaml.dump(data, stream)

# Run each serializer 10,000 times and report the total elapsed time
json_dump_time = timeit.timeit(dump_json, number=10000)
yaml_dump_time = timeit.timeit(dump_yaml, number=10000)
print(f"JSON dumping time: {json_dump_time:.6f} seconds")
print(f"YAML dumping time: {yaml_dump_time:.6f} seconds")
Output:
JSON dumping time: 0.034363 seconds
YAML dumping time: 9.948535 seconds
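In this run, dumping data to JSON is roughly 290 times faster than dumping to YAML (about 0.034 seconds versus 9.95 seconds for 10,000 iterations). As with parsing, dump_yaml() creates a new YAML instance and a new StringIO buffer on every call, so that setup is included in the YAML figure.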
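The totals reported in the three outputs can be turned into ratios and per-call times, which are easier to compare than raw totals. This is a small arithmetic sketch that uses only the figures printed in this tutorial; the dictionary layout and variable names are chosen here for illustration.

```python
# Total seconds for 10,000 iterations, copied from the outputs reported above
# as (JSON total, YAML total)
reported = {
    "parsing": (0.024818, 4.286867),
    "nested parsing": (0.024721, 4.423315),
    "dumping": (0.034363, 9.948535),
}
iterations = 10000

ratios = {}
for name, (json_total, yaml_total) in reported.items():
    # How many times longer the YAML run took than the JSON run
    ratios[name] = yaml_total / json_total
    # Average YAML time for a single call, in milliseconds
    yaml_ms = yaml_total / iterations * 1000
    print(f"{name}: YAML/JSON ratio {ratios[name]:.1f}, YAML {yaml_ms:.3f} ms per call")
```

These ratios describe one machine and one small sample, so they are best read as an order of magnitude, not as a fixed property of either format.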
Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.