Export SQL to Excel using Python Pandas to_excel

In this tutorial, you’ll learn how to export data from SQL databases to Excel files using Pandas to_excel().

You’ll learn how to connect to an SQL database, query data, and export it into an Excel file using Python’s Pandas library.

 

 

Establishing SQL Database Connections

To export data from SQL to Excel, the first step is establishing a connection to your SQL database.

This process varies depending on the SQL database system you are using.

We will cover three popular databases: SQLite, PostgreSQL, and MySQL.

Python, with its libraries like sqlite3, psycopg2, and mysql-connector-python, makes connecting to these databases straightforward.

SQLite Connection

SQLite is a lightweight, disk-based database. It doesn’t require a separate server process, making it a popular choice for development and testing.

import sqlite3
conn = sqlite3.connect('sample_data.db')
print("Connected to SQLite.")

Output:

Connected to SQLite.

In this snippet, we connect to an SQLite database named ‘sample_data.db’. If this file doesn’t exist, SQLite will create it.
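
If you are following along and the database is empty, you can create a small table to experiment with. The sketch below is only an assumption, using the same customers schema (id, name, age) that the later query examples rely on:

# Create a sample 'customers' table (sketch; the schema is an assumption
# matching the queries used later in this tutorial)
cursor = conn.cursor()
cursor.execute("""
CREATE TABLE IF NOT EXISTS customers (
    id INTEGER PRIMARY KEY,
    name TEXT,
    age INTEGER
)""")
cursor.executemany(
    "INSERT INTO customers (name, age) VALUES (?, ?)",
    [("Angela", 30), ("John", 27), ("Vanessa", 35), ("Luis", 40), ("Sophie", 28)])
conn.commit()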

PostgreSQL Connection

PostgreSQL is a powerful, open-source object-relational database system.

import psycopg2
conn = psycopg2.connect(
    host="localhost",
    database="sample_data",
    user="username",
    password="password")
print("Connected to PostgreSQL.")

Output:

Connected to PostgreSQL.

Here, we connect to a PostgreSQL database. Ensure you have the correct host, database, user, and password.
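
If the credentials are wrong or the server is unreachable, psycopg2 raises an OperationalError. A defensive variant of the connection code (a sketch; the credentials are placeholders) looks like this:

import psycopg2

try:
    conn = psycopg2.connect(
        host="localhost",
        database="sample_data",
        user="username",
        password="password")
    print("Connected to PostgreSQL.")
except psycopg2.OperationalError as exc:
    # Report the failure instead of letting the script crash
    print(f"Could not connect to PostgreSQL: {exc}")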

MySQL Connection

MySQL is another widely used open-source relational database management system.

import mysql.connector
conn = mysql.connector.connect(
    host="localhost",
    database="sample_data",
    user="username",
    password="password")
print("Connected to MySQL.")

Output:

Connected to MySQL.

This code snippet demonstrates how to establish a connection with a MySQL database.

 

Retrieving Data from SQL Databases

Once you have established a connection to your SQL database, the next step is to retrieve the data you need with the read_sql() function, which executes a SQL query and loads the result into a DataFrame.

This process is similar across SQLite, PostgreSQL, and MySQL, with differences in the SQL syntax and connection methods.
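
Note that recent Pandas versions officially support SQLAlchemy connectables (and plain sqlite3 connections) in read_sql; passing a raw psycopg2 or mysql-connector connection still works in practice but may emit a UserWarning. If you prefer to avoid that, you can wrap the connection details in a SQLAlchemy engine. The sketch below assumes SQLAlchemy is installed, and the connection URL is a placeholder:

# Optional: use a SQLAlchemy engine with read_sql (URL is a placeholder)
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql+psycopg2://username:password@localhost/sample_data")
data_frame = pd.read_sql("SELECT * FROM customers", engine)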

Querying Data from SQLite

Assuming you have already connected to your SQLite database as shown in the previous section, here’s how you can retrieve data:

import pandas as pd
query = "SELECT * FROM customers"
data_frame = pd.read_sql(query, conn)
print(data_frame.head())

Output:

   id      name    age
0   1   Angela     30
1   2     John     27
2   3  Vanessa     35
3   4      Luis     40
4   5   Sophie     28

This code snippet executes a SQL query to select all records from the ‘customers’ table and loads them into a DataFrame. data_frame.head() displays the first five rows.
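
Instead of formatting values directly into the query string, you can also pass them through the params argument of read_sql, which is safer. A minimal sketch for SQLite (which uses ? placeholders):

# Parameterized query (sketch): fetch only customers older than 30
query = "SELECT * FROM customers WHERE age > ?"
adults_df = pd.read_sql(query, conn, params=(30,))
print(adults_df.head())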

Querying Data from PostgreSQL

For PostgreSQL, the process remains the same. Ensure you’re connected to your PostgreSQL database:

query = "SELECT * FROM orders"
data_frame = pd.read_sql(query, conn)
print(data_frame.head())

Output:

   order_id  product_id  quantity
0       101          30         2
1       102          14         1
2       103          73         5
3       104          58         3
4       105          91         2

Here, we retrieve data from the ‘orders’ table in a PostgreSQL database.

Querying Data from MySQL

Similarly, for MySQL:

query = "SELECT * FROM sales"
data_frame = pd.read_sql(query, conn)
print(data_frame.head())

Output:

   sale_id  customer_id  amount
0        1          100   500.0
1        2          101   300.0
2        3          102   450.0
3        4          103   600.0
4        5          104   350.0

This snippet fetches data from a MySQL database’s ‘sales’ table.

 

Export to Excel

After retrieving data from your SQL database and storing it in a Pandas DataFrame, the next step is to export this data to an Excel file using the to_excel() method.

This method allows you to write DataFrame data to an Excel file, offering various parameters to customize the output.

Here’s how you can export your DataFrame to an Excel file:

excel_file_path = 'exported_data.xlsx'
data_frame.to_excel(excel_file_path, index=False)
print("Data exported to Excel file successfully.")

Output:

Data exported to Excel file successfully.

In this code, data_frame.to_excel(excel_file_path, index=False) exports the data from the DataFrame data_frame to an Excel file named ‘exported_data.xlsx’.

The index=False parameter is used to prevent Pandas from writing row indices into the Excel file.

If you need the row indices in your Excel file, you can remove this parameter.
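
Writing .xlsx files requires an Excel engine such as openpyxl to be installed. Beyond index, to_excel accepts several other useful parameters; the values below are purely illustrative:

# A few commonly used to_excel options (illustrative values)
data_frame.to_excel(
    'exported_data.xlsx',
    sheet_name='Customers',   # name of the worksheet
    index=False,              # omit the DataFrame index column
    na_rep='N/A',             # how missing values should appear
    float_format='%.2f')      # format for floating-point numbers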

 

Export Specific Columns from SQL Query

You can modify the SQL query to select only the columns you need. For example, if you only want the name and age columns from the customers table, your SQL query would be:

query = "SELECT name, age FROM customers"
data_frame = pd.read_sql(query, conn)
print(data_frame.head())

Output:

       name  age
0    Angela   30
1      John   27
2   Vanessa   35
3      Luis   40
4    Sophie   28

After selecting the specific columns and loading them into a DataFrame, you can export them to Excel using the same to_excel method:

data_frame.to_excel('selected_columns_data.xlsx', index=False)
print("Selected columns exported to Excel successfully.")

Output:

Selected columns exported to Excel successfully.

 

Exporting Multiple SQL Tables

ExcelWriter in Pandas allows you to write multiple DataFrames to different sheets of the same Excel file; you name each sheet with the sheet_name parameter of to_excel().

Here’s an example demonstrating how to export data from two tables, customers and orders, into separate sheets:

with pd.ExcelWriter('multiple_tables.xlsx') as writer:
    # Query and export the first table
    query_customers = "SELECT * FROM customers"
    customers_df = pd.read_sql(query_customers, conn)
    customers_df.to_excel(writer, sheet_name='Customers', index=False)

    # Query and export the second table
    query_orders = "SELECT * FROM orders"
    orders_df = pd.read_sql(query_orders, conn)
    orders_df.to_excel(writer, sheet_name='Orders', index=False)
print("Multiple tables exported to separate sheets in Excel successfully.")

Output:

Multiple tables exported to separate sheets in Excel successfully.
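
If you have many tables, the same pattern can be written as a loop. The table names below are simply the ones used in this tutorial:

# Loop variant (sketch): write one sheet per table
tables = ['customers', 'orders']
with pd.ExcelWriter('multiple_tables_loop.xlsx') as writer:
    for table in tables:
        df = pd.read_sql(f"SELECT * FROM {table}", conn)
        df.to_excel(writer, sheet_name=table.capitalize(), index=False)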

 

Exporting Specific SQL Query Results

Let’s consider a scenario where you want to export the results of a SQL query that involves joining two tables and filtering data based on certain conditions.

Here’s an example using a join between customers and orders tables:

query = """
SELECT c.name, o.order_id, o.amount
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.amount > 100
"""
complex_query_df = pd.read_sql(query, conn)
print(complex_query_df.head())

Output:

       name  order_id  amount
0    Angela       101   150.0
1      John       103   200.0
2   Vanessa       104   250.0
3      Luis       107   180.0
4    Sophie       110   300.0

This snippet demonstrates a SQL query that joins customers and orders, selecting orders where the amount is greater than 100.

Once you have the results in a DataFrame, you can export it to Excel using to_excel():

complex_query_df.to_excel('complex_query_results.xlsx', index=False)
print("Complex SQL query results exported to Excel successfully.")

Output:

Complex SQL query results exported to Excel successfully.

In this example, the DataFrame complex_query_df is exported to an Excel file named ‘complex_query_results.xlsx’.
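
Once all exports are finished, close the database connection:

# Close the connection when you are done (the call is the same for
# sqlite3, psycopg2, and mysql-connector-python connections)
conn.close()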
