Export MultiIndex DataFrame to HTML using Pandas to_html
In this tutorial, we’ll learn how to export multi-indexed DataFrame to HTML format using Pandas.
We’ll start by exploring the basic export of a multi-index DataFrame, moving towards custom formatting, adding interactive JavaScript elements, and embedding images or links.
Export MultiIndex DataFrame to HTML
To export a multi-index DataFrame to HTML in Python using Pandas, you begin by creating a DataFrame with a multi-index.
import pandas as pd data = { "Region": ["North", "North", "South", "South"], "Service": ["Voice", "Data", "Voice", "Data"], "Usage": [340, 120, 450, 300], "Customers": [30, 45, 25, 40] } df = pd.DataFrame(data) df.set_index(["Region", "Service"], inplace=True) print(df)
After running this code, your DataFrame df
looks like this:
Usage Customers Region Service North Voice 340 30 Data 120 45 South Voice 450 25 Data 300 40
Now, let’s export this DataFrame to an HTML file using to_html()
function:
html_data = df.to_html() with open("telecom_data.html", "w") as file: file.write(html_data)
This code converts the DataFrame into HTML format, which can be directly viewed in a web browser or integrated into a web application.
Change Font & Color
Let’s customize the appearance of our DataFrame by incorporating HTML and CSS for styling:
html_data = df.to_html() html_data = html_data.replace('<table border="1" class="dataframe">', '<table border="1" class="dataframe" style="font-family: Arial; color: blue;">')
In this example, the code replaces the default <table>
tag with one that includes a style
attribute.
This attribute changes the font to Arial and sets the text color to blue.
For more complex styling, you might want to use external CSS.
Create a CSS file (e.g., style.css
) with your desired styles:
.dataframe { font-family: Arial; color: blue; } th { background-color: #f2f2f2; }
Link this CSS file in your HTML:
html_data_with_css = '<link rel="stylesheet" type="text/css" href="style.css"/>' + html_data
Table Attributes
Customizing table attributes such as border, cell spacing/cell padding, and class name is essential for enhancing the visual appeal and readability of your HTML table.
Setting Cell spacing and Cell padding
While Pandas to_html
method doesn’t directly support cell spacing and cell padding attributes, you can do this by manipulating the HTML string:
html_data = df.to_html() html_data = html_data.replace('<table border="1" class="dataframe">', '<table border="1" cellspacing="5" cellpadding="5">')
This replacement adds cell spacing and cell padding to your table.
Adding a Class Name
Adding a class name to your table can be useful for applying CSS styles:
html_data = df.to_html() html_data = html_data.replace('<table border="1" class="dataframe">', '<table border="1" class="myCustomClass">')
With this class name, you can now use CSS to style your table.
Cell Formatting
Formatting the cell values in a DataFrame, such as floating-point numbers or dates, is crucial for making your data presentation clear and concise.
Formatting Floats
Pandas allows you to format floating-point numbers with the float_format
parameter. For instance, you can limit the number of decimal places:
html_data = df.to_html(float_format="{:.2f}".format)
This code will format all floating-point numbers in your DataFrame to have two decimal places, making the data cleaner and more uniform.
Formatting Dates
While the to_html
method doesn’t directly offer a date formatting option, you can format dates in the DataFrame before converting it to HTML:
df['DateColumn'] = pd.to_datetime(df['DateColumn']).dt.strftime('%Y-%m-%d') html_data = df.to_html()
In this example, the strftime
method is used to format the date in a ‘YYYY-MM-DD’ format.
Expanding Levels
To control the expansion of MultiIndex levels, you can reset the index of your DataFrame before exporting it.
This will convert the MultiIndex into regular columns:
# Reset index to convert MultiIndex into columns df_reset = df.reset_index() html_data = df_reset.to_html()
This change affects how they are represented in the HTML table.
Alternatively, you may want to keep certain levels of the MultiIndex as index while expanding others into columns.
This can be done by resetting the index partially:
df_partial_reset = df.reset_index(level='Service') html_data = df_partial_reset.to_html()
In this example, only the ‘Service’ level is reset, turning it into a column, while ‘Region’ remains as an index level.
Collapsing Levels
Pandas doesn’t directly provide a method to collapse inner levels in the to_html
function.
However, you can do this by modifying the DataFrame itself.
One common approach is to use the groupby
method combined with aggregation functions:
collapsed_df = df.groupby(level='Region').sum() html_data = collapsed_df.to_html()
In this example, the inner level ‘Service’ is collapsed, and the data is aggregated by the ‘Region’ level.
If you need more control over how the levels are collapsed, you can create a custom aggregation:
custom_collapsed_df = df.groupby(level='Region').agg({'Usage': 'mean', 'Customers': 'sum'}) html_data = custom_collapsed_df.to_html()
Here, the DataFrame is grouped by ‘Region’, but with different aggregations for ‘Usage’ and ‘Customers’.
You can add a header to your HTML table by manually constructing the HTML string:
header_html = "<h2>Telecom Usage Statistics</h2><p>Data Overview</p>" df_html = df.to_html() html_data_with_header = header_html + df_html
This code creates a custom header with a title and a brief description, which is then concatenated with the HTML string of the DataFrame.
Similarly, you can add a footer to provide additional information or notes:
footer_html = "<p>Report generated by: Your Company Name</p>" html_data_with_footer = html_data + footer_html
Integrating JavaScript
Let’s see how you can integrate JavaScript and interactive elements into the HTML output.
Highlight Row When Hovered Over
You can start by adding simple JavaScript functionalities. For instance, adding a script to highlight a row when hovered over:
# Define a JavaScript function for row highlighting script = """ <script> function highlightRow(row) { row.style.backgroundColor = row.style.backgroundColor === 'yellow' ? '' : 'yellow'; } </script> """ # Attach event handler to each row html_data = html_data.replace('<tr>', '<tr onmouseover="highlightRow(this)">') html_data = html_data + script
This code attaches a JavaScript function to each row of the table, which changes the background color of the row when hovered over.
Adding Interactive Elements
For more advanced interactivity, you can integrate external JavaScript libraries like DataTables. DataTables can enhance your HTML tables with features like sorting, paging, and searching:
df_reset = df.reset_index() html_data = df_reset.to_html() html_data = html_data.replace('<table border="1" class="dataframe">', '<table id="myTable">') datatables_import = """ <link rel="stylesheet" type="text/css" href="https://cdn.datatables.net/1.13.7/css/jquery.dataTables.min.css"> <script type="text/javascript" charset="utf8" src="https://code.jquery.com/jquery-3.7.1.min.js"></script> <script type="text/javascript" charset="utf8" src="https://cdn.datatables.net/1.13.7/js/jquery.dataTables.min.js"></script> """ # Initialize DataTables on the table init_script = """ <script> $(document).ready( function () { let table = $('#myTable').DataTable(); } ); </script> """ html_data = datatables_import + html_data + init_script
This code integrates DataTables, which automatically adds functionalities like pagination and sorting to your table.
Embedding Links or Images
Let’s explore how to embed images and links in the HTML output.
Embedding Hyperlinks
To add hyperlinks, you’ll need to modify the DataFrame’s cells to contain HTML anchor (<a>
) tags.
For example, if you want to link each ‘Region’ to an external URL:
import pandas as pd data = { "Region": ["North", "North", "South", "South"], "Service": ["Voice", "Data", "Voice", "Data"], "Usage": [340, 120, 450, 300], "Customers": [30, 45, 25, 40] } df = pd.DataFrame(data) df['Region'] = df['Region'].apply(lambda x: f'<a href="https://example.com/{x}">{x}</a>') df.set_index(["Region", "Service"], inplace=True) html_data = df.to_html(escape=False) with open("telecom_data.html", "w") as file: file.write(html_data)
In this code, a new column ‘Region_Link’ is created, where each ‘Region’ value is converted into a clickable link.
The escape=False
parameter ensures that HTML tags in the DataFrame are correctly interpreted as HTML and not as plain text.
Embedding Images
Similarly, to embed images, you insert <img>
tags into DataFrame cells:
import pandas as pd data = { "Region": ["North", "North", "South", "South"], "Service": ["Voice", "Data", "Voice", "Data"], "Usage": [340, 120, 450, 300], "Customers": [30, 45, 25, 40] } df = pd.DataFrame(data) df['Service_Image'] = df['Service'].apply(lambda x: f'<img src="https://example.com/images/{x}.png" alt="{x}"/>') df.set_index(["Region", "Service"], inplace=True) html_data = df.to_html(escape=False) with open("telecom_data.html", "w") as file: file.write(html_data)
Here, each ‘Service’ is associated with an image. The src
attribute in the <img>
tag points to the URL of the image.
Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.