Export Excel to XML using Python Pandas

In this tutorial, you’ll learn how to export data from Microsoft Excel files to XML format using Pandas in Python.

From handling single-sheet Excel files to dealing with multi-sheet workbooks, customizing XML structure, setting specific encoding, and more.

 

 

Convert a Single Sheet Excel File to XML

The Pandas read_excel() function reads the Excel file, and to_xml() function exports the data to an XML format.

import pandas as pd
df = pd.read_excel("single_sheet.xlsx")
xml_data = df.to_xml()

 

Export Multiple Sheets to Separate XML Files

You can export multiple sheets by specifying the sheet name to the sheet_name parameter:

import pandas as pd
xlsx = pd.ExcelFile("multi_sheet.xlsx")
for sheet_name in xlsx.sheet_names:
    df = pd.read_excel(xlsx, sheet_name)
    xml_data = df.to_xml()
    print(f"XML data for sheet: {sheet_name}\n", xml_data)

 

Convert Specific Columns to XML

You can export specific Excel columns by specifying the columns to the usecols parameter:

import pandas as pd

# Reading specific columns from an Excel file
df = pd.read_excel("data.xlsx", usecols=['Column1', 'Column2'])
xml_data = df.to_xml()

 

Specify Excel Columns

Another way of specifying columns is to use attr_cols parameter to specify the columns you want to export:

import pandas as pd
df = pd.read_excel("data.xlsx")
xml_data = df.to_xml(index=False, attr_cols=df.columns)

Here we specified the columns from the XML after reading the Excel file, unlike the previous method we specified the columns from the beginning.

 

Export Excel Data with Custom Root and Row Tags

You can use the root_name and row_name parameters to use custom root name and row name respectively:

import pandas as pd
df = pd.read_excel("data.xlsx")

# Exporting to XML with custom root and row tags
xml_data = df.to_xml(root_name='RootTag', row_name='RowTag')

 

Exclude Index

To exclude the index from the export process, you can set the index parameter to False:

import pandas as pd
df = pd.read_excel("data.xlsx")
xml_data = df.to_xml(index=False)

 

Export with Custom Header

To set a custom header, you can use the header parameter like this:

import pandas as pd
df = pd.read_excel("data.xlsx")
custom_header = '<?xml version="1.0" encoding="UTF-8"?>\n<!-- Custom Header -->'
xml_data = df.to_xml(header=custom_header)

 

Convert with Namespace Declarations

You can set the namespace by setting the namespace parameter:

import pandas as pd
df = pd.read_excel("data.xlsx")
namespaces = {'xmlns:xsi': 'http://www.w3.org/2001/XMLSchema-instance'}
xml_data = df.to_xml(namespaces=namespaces)

 

Export with Custom Encoding

The encoding parameter allows you to set the encoding you want for the exported XML file:

import pandas as pd
df = pd.read_excel("data.xlsx")
xml_data = df.to_xml(encoding='utf-16')

 

Convert Excel Data to Pretty-Printed XML

YOu can use the pretty_print parameter to output the XML printed with spaces and adjusted for reading:

import pandas as pd
df = pd.read_excel("data.xlsx")
xml_data = df.to_xml(pretty_print=True)
Leave a Reply

Your email address will not be published. Required fields are marked *