Export Excel to XML using Python Pandas
In this tutorial, you’ll learn how to export data from Microsoft Excel files to XML format using Pandas in Python.
From handling single-sheet Excel files to dealing with multi-sheet workbooks, customizing XML structure, setting specific encoding, and more.
- 1 Convert a Single Sheet Excel File to XML
- 2 Export Multiple Sheets to Separate XML Files
- 3 Convert Specific Columns to XML
- 4 Specify Excel Columns
- 5 Export Excel Data with Custom Root and Row Tags
- 6 Exclude Index
- 7 Export with Custom Header
- 8 Convert with Namespace Declarations
- 9 Export with Custom Encoding
- 10 Convert Excel Data to Pretty-Printed XML
Convert a Single Sheet Excel File to XML
The Pandas read_excel()
function reads the Excel file, and to_xml()
function exports the data to an XML format.
import pandas as pd df = pd.read_excel("single_sheet.xlsx") xml_data = df.to_xml()
Export Multiple Sheets to Separate XML Files
You can export multiple sheets by specifying the sheet name to the sheet_name
parameter:
import pandas as pd xlsx = pd.ExcelFile("multi_sheet.xlsx") for sheet_name in xlsx.sheet_names: df = pd.read_excel(xlsx, sheet_name) xml_data = df.to_xml() print(f"XML data for sheet: {sheet_name}\n", xml_data)
Convert Specific Columns to XML
You can export specific Excel columns by specifying the columns to the usecols
parameter:
import pandas as pd # Reading specific columns from an Excel file df = pd.read_excel("data.xlsx", usecols=['Column1', 'Column2']) xml_data = df.to_xml()
Specify Excel Columns
Another way of specifying columns is to use attr_cols
parameter to specify the columns you want to export:
import pandas as pd df = pd.read_excel("data.xlsx") xml_data = df.to_xml(index=False, attr_cols=df.columns)
Here we specified the columns from the XML after reading the Excel file, unlike the previous method we specified the columns from the beginning.
Export Excel Data with Custom Root and Row Tags
You can use the root_name
and row_name
parameters to use custom root name and row name respectively:
import pandas as pd df = pd.read_excel("data.xlsx") # Exporting to XML with custom root and row tags xml_data = df.to_xml(root_name='RootTag', row_name='RowTag')
Exclude Index
To exclude the index from the export process, you can set the index
parameter to False:
import pandas as pd df = pd.read_excel("data.xlsx") xml_data = df.to_xml(index=False)
Export with Custom Header
To set a custom header, you can use the header
parameter like this:
import pandas as pd df = pd.read_excel("data.xlsx") custom_header = '<?xml version="1.0" encoding="UTF-8"?>\n<!-- Custom Header -->' xml_data = df.to_xml(header=custom_header)
Convert with Namespace Declarations
You can set the namespace by setting the namespace
parameter:
import pandas as pd df = pd.read_excel("data.xlsx") namespaces = {'xmlns:xsi': 'http://www.w3.org/2001/XMLSchema-instance'} xml_data = df.to_xml(namespaces=namespaces)
Export with Custom Encoding
The encoding
parameter allows you to set the encoding you want for the exported XML file:
import pandas as pd df = pd.read_excel("data.xlsx") xml_data = df.to_xml(encoding='utf-16')
Convert Excel Data to Pretty-Printed XML
YOu can use the pretty_print
parameter to output the XML printed with spaces and adjusted for reading:
import pandas as pd df = pd.read_excel("data.xlsx") xml_data = df.to_xml(pretty_print=True)
Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.