Export XML to Dictionary using Python Pandas
In this tutorial, we’ll learn how to export XML data to a dictionary using Pandas in Python.
We’ll use pandas.read_xml()
function to read XML and pandas.to_dict()
function to export the data to a dictionary.
Basic XML to Dictionary
In this example, we have a simple XML file containing customer data.
Sample XML Content:
<Customers> <Customer> <ID>1</ID> <Name>Customer A</Name> <Email>customerA@email.com</Email> </Customer> <Customer> <ID>2</ID> <Name>Customer B</Name> <Email>customerB@email.com</Email> </Customer> </Customers>
Code:
import pandas as pd df = pd.read_xml('customers.xml') customer_dict = df.to_dict(orient='records') print(customer_dict)
Output:
[{'ID': 1, 'Name': 'Customer A', 'Email': 'customerA@email.com'}, {'ID': 2, 'Name': 'Customer B', 'Email': 'customerB@email.com'}]
Nested XML Elements
Here, the XML file includes nested elements for service plans under each customer.
Sample XML Content:
<Customers> <Customer> <ID>1</ID> <Name>Customer A</Name> <Plans> <Plan>Plan A</Plan> <Plan>Plan B</Plan> </Plans> </Customer> <Customer> <ID>2</ID> <Name>Customer B</Name> <Plans> <Plan>Plan C</Plan> </Plans> </Customer> </Customers>
Code:
import xml.etree.ElementTree as ET import pandas as pd tree = ET.parse('customers_plans.xml') root = tree.getroot() df = pd.DataFrame(columns=['ID', 'Name', 'Plans']) for i, customer in enumerate(root): row = {} for child in customer: if child.tag == 'Plans': plans = [] for plan in child: plans.append(plan.text) row[child.tag] = plans else: row[child.tag] = child.text df.loc[i] = row customer_plan_dict = df.to_dict(orient='records') print(customer_plan_dict)
Output:
[{'ID': '1', 'Name': 'Customer A', 'Plans': ['Plan A', 'Plan B']}, {'ID': '2', 'Name': 'Customer B', 'Plans': ['Plan C']}]
XML with Attributes
In this scenario, the XML file includes attributes within elements, such as plan type and status.
Sample XML Content:
<Customers> <Customer ID="1" Status="Active"> <Name>Customer A</Name> <Plan>Plan A</Plan> </Customer> <Customer ID="2" Status="Inactive"> <Name>Customer B</Name> <Plan>Plan B</Plan> </Customer> </Customers>
Code:
import pandas as pd df = pd.read_xml('customers_attributes.xml', xpath='//Customer') customer_attributes_dict = df.to_dict(orient='records') print(customer_attributes_dict)
Output:
[{'ID': 1, 'Status': 'Active', 'Name': 'Customer A', 'Plan': 'Plan A'}, {'ID': 2, 'Status': 'Inactive', 'Name': 'Customer B', 'Plan': 'Plan B'}]
Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.