Export MultiIndex DataFrame to HTML using Pandas to_html

In this tutorial, we’ll learn how to export multi-indexed DataFrame to HTML format using Pandas.

We’ll start by exploring the basic export of a multi-index DataFrame, moving towards custom formatting, adding interactive JavaScript elements, and embedding images or links.

 

 

Export MultiIndex DataFrame to HTML

To export a multi-index DataFrame to HTML in Python using Pandas, you begin by creating a DataFrame with a multi-index.

import pandas as pd
data = {
    "Region": ["North", "North", "South", "South"],
    "Service": ["Voice", "Data", "Voice", "Data"],
    "Usage": [340, 120, 450, 300],
    "Customers": [30, 45, 25, 40]
}
df = pd.DataFrame(data)
df.set_index(["Region", "Service"], inplace=True)
print(df)

After running this code, your DataFrame df looks like this:

                Usage  Customers
Region Service                  
North  Voice     340         30
       Data      120         45
South  Voice     450         25
       Data      300         40

Now, let’s export this DataFrame to an HTML file using to_html() function:

html_data = df.to_html()
with open("telecom_data.html", "w") as file:
    file.write(html_data)

This code converts the DataFrame into HTML format, which can be directly viewed in a web browser or integrated into a web application.

 

Change Font & Color

Let’s customize the appearance of our DataFrame by incorporating HTML and CSS for styling:

html_data = df.to_html()
html_data = html_data.replace('<table border="1" class="dataframe">',
                              '<table border="1" class="dataframe" style="font-family: Arial; color: blue;">')

In this example, the code replaces the default <table> tag with one that includes a style attribute.

This attribute changes the font to Arial and sets the text color to blue.

For more complex styling, you might want to use external CSS.

Create a CSS file (e.g., style.css) with your desired styles:

.dataframe {
    font-family: Arial;
    color: blue;
}
th {
    background-color: #f2f2f2;
}

Link this CSS file in your HTML:

html_data_with_css = '<link rel="stylesheet" type="text/css" href="style.css"/>' + html_data

 

Table Attributes

Customizing table attributes such as border, cell spacing/cell padding, and class name is essential for enhancing the visual appeal and readability of your HTML table.

Setting Cell spacing and Cell padding

While Pandas to_html method doesn’t directly support cell spacing and cell padding attributes, you can achieve this by manipulating the HTML string:

html_data = df.to_html()
html_data = html_data.replace('<table border="1" class="dataframe">', '<table border="1" cellspacing="5" cellpadding="5">')

This replacement adds cell spacing and cell padding to your table.

Adding a Class Name

Adding a class name to your table can be useful for applying CSS styles:

html_data = df.to_html()
html_data = html_data.replace('<table border="1" class="dataframe">', '<table border="1" class="myCustomClass">')

With this class name, you can now use CSS to style your table.

 

Cell Formatting

Formatting the cell values in a DataFrame, such as floating-point numbers or dates, is crucial for making your data presentation clear and concise.

Formatting Floats

Pandas allows you to format floating-point numbers with the float_format parameter. For instance, you can limit the number of decimal places:

html_data = df.to_html(float_format="{:.2f}".format)

This code will format all floating-point numbers in your DataFrame to have two decimal places, making the data cleaner and more uniform.

Formatting Dates

While the to_html method doesn’t directly offer a date formatting option, you can format dates in the DataFrame before converting it to HTML:

df['DateColumn'] = pd.to_datetime(df['DateColumn']).dt.strftime('%Y-%m-%d')
html_data = df.to_html()

In this example, the strftime method is used to format the date in a ‘YYYY-MM-DD’ format.

 

Expanding Levels

To control the expansion of MultiIndex levels, you can reset the index of your DataFrame before exporting it.

This will convert the MultiIndex into regular columns:

# Reset index to convert MultiIndex into columns
df_reset = df.reset_index()
html_data = df_reset.to_html()

This change affects how they are represented in the HTML table.

Alternatively, you may want to keep certain levels of the MultiIndex as index while expanding others into columns.

This can be done by resetting the index partially:

df_partial_reset = df.reset_index(level='Service')
html_data = df_partial_reset.to_html()

In this example, only the ‘Service’ level is reset, turning it into a column, while ‘Region’ remains as an index level.

 

Collapsing Levels

Pandas doesn’t directly provide a method to collapse inner levels in the to_html function.

However, you can achieve this by modifying the DataFrame itself.

One common approach is to use the groupby method combined with aggregation functions:

collapsed_df = df.groupby(level='Region').sum()
html_data = collapsed_df.to_html()

In this example, the inner level ‘Service’ is collapsed, and the data is aggregated by the ‘Region’ level.

If you need more control over how the levels are collapsed, you can create a custom aggregation:

custom_collapsed_df = df.groupby(level='Region').agg({'Usage': 'mean', 'Customers': 'sum'})
html_data = custom_collapsed_df.to_html()

Here, the DataFrame is grouped by ‘Region’, but with different aggregations for ‘Usage’ and ‘Customers’.

 

Including Header and Footer

You can add a header to your HTML table by manually constructing the HTML string:

header_html = "<h2>Telecom Usage Statistics</h2><p>Data Overview</p>"
df_html = df.to_html()
html_data_with_header = header_html + df_html

This code creates a custom header with a title and a brief description, which is then concatenated with the HTML string of the DataFrame.

Similarly, you can add a footer to provide additional information or notes:

footer_html = "<p>Report generated by: Your Company Name</p>"
html_data_with_footer = html_data + footer_html

 

Integrating JavaScript

Let’s see how you can integrate JavaScript and interactive elements into the HTML output.

Highlight Row When Hovered Over

You can start by adding simple JavaScript functionalities. For instance, adding a script to highlight a row when hovered over:

# Define a JavaScript function for row highlighting
script = """
<script>
    function highlightRow(row) {
        row.style.backgroundColor = row.style.backgroundColor === 'yellow' ? '' : 'yellow';
    }
</script>
"""

# Attach event handler to each row
html_data = html_data.replace('<tr>', '<tr onmouseover="highlightRow(this)">')
html_data = html_data + script

This code attaches a JavaScript function to each row of the table, which changes the background color of the row when hovered over.

Adding Interactive Elements

For more advanced interactivity, you can integrate external JavaScript libraries like DataTables. DataTables can enhance your HTML tables with features like sorting, paging, and searching:

df_reset = df.reset_index()
html_data = df_reset.to_html()
html_data = html_data.replace('<table border="1" class="dataframe">', '<table id="myTable">')

datatables_import = """
<link rel="stylesheet" type="text/css" href="https://cdn.datatables.net/1.13.7/css/jquery.dataTables.min.css">
<script type="text/javascript" charset="utf8" src="https://code.jquery.com/jquery-3.7.1.min.js"></script>
<script type="text/javascript" charset="utf8" src="https://cdn.datatables.net/1.13.7/js/jquery.dataTables.min.js"></script>
"""

# Initialize DataTables on the table
init_script = """
<script>
$(document).ready( function () {
    let table = $('#myTable').DataTable();
} );
</script>
"""
html_data = datatables_import + html_data + init_script

This code integrates DataTables, which automatically adds functionalities like pagination and sorting to your table.

 

Embedding Links or Images

Let’s explore how to embed images and links in the HTML output.

Embedding Hyperlinks

To add hyperlinks, you’ll need to modify the DataFrame’s cells to contain HTML anchor (<a>) tags.

For example, if you want to link each ‘Region’ to an external URL:

import pandas as pd
data = {
    "Region": ["North", "North", "South", "South"],
    "Service": ["Voice", "Data", "Voice", "Data"],
    "Usage": [340, 120, 450, 300],
    "Customers": [30, 45, 25, 40]
}
df = pd.DataFrame(data)
df['Region'] = df['Region'].apply(lambda x: f'<a href="https://example.com/{x}">{x}</a>')
df.set_index(["Region", "Service"], inplace=True)
html_data = df.to_html(escape=False)
with open("telecom_data.html", "w") as file:
    file.write(html_data)

In this code, a new column ‘Region_Link’ is created, where each ‘Region’ value is converted into a clickable link.

The escape=False parameter ensures that HTML tags in the DataFrame are correctly interpreted as HTML and not as plain text.

Embedding Images

Similarly, to embed images, you insert <img> tags into DataFrame cells:

import pandas as pd
data = {
    "Region": ["North", "North", "South", "South"],
    "Service": ["Voice", "Data", "Voice", "Data"],
    "Usage": [340, 120, 450, 300],
    "Customers": [30, 45, 25, 40]
}
df = pd.DataFrame(data)
df['Service_Image'] = df['Service'].apply(lambda x: f'<img src="https://example.com/images/{x}.png" alt="{x}"/>')
df.set_index(["Region", "Service"], inplace=True)
html_data = df.to_html(escape=False)
with open("telecom_data.html", "w") as file:
    file.write(html_data)

Here, each ‘Service’ is associated with an image. The src attribute in the <img> tag points to the URL of the image.

Leave a Reply

Your email address will not be published. Required fields are marked *