
Seaborn histplot (Visualize data with histograms)

Seaborn is one of the most widely used data visualization libraries that run on top of Matplotlib. It lets us render many types of plots and offers a straightforward, intuitive, yet highly customizable API for generating visualizations from data.

Without rich visualization, it is difficult to understand data and to communicate what it shows. Data analysts and data science professionals who want to see how values are distributed, whether as raw counts or as densities and probabilities, can use histogram plots in Seaborn.

In this tutorial, we will discuss what histplot() is and how to use it in different ways to generate histograms.

 

 

What is a histogram?

A histogram is a graphical representation of how numeric data points are distributed across a range of values.

It looks like a bar plot, but it condenses a data series into an easily interpreted visual by grouping many data points into logical bins, or ranges.

The horizontal x-axis holds the bins, and the vertical y-axis represents the count or rate of occurrence of the data points that fall into each bin.
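
To make the idea of bins concrete, here is a minimal sketch that counts values per bin with NumPy's histogram() function, without drawing anything. The ten values and the choice of five bins over the range 0 to 10 are made up for illustration.

```python
import numpy as np

# Ten made-up measurements (hypothetical data for illustration)
values = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5, 9])

# Split the range 0..10 into five equal-width bins and count the values in each.
# Every bin includes its left edge; only the last bin also includes its right edge.
counts, edges = np.histogram(values, bins=5, range=(0, 10))

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.0f} to {right:.0f}: {count}")
```

Each printed count is the height that the corresponding bar would have in a histogram of these values.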

 

What is Seaborn histplot and how to use it?

We use seaborn.histplot() to generate a histogram plot through Seaborn. A simplified view of the histplot() signature, showing the most commonly used parameters, is:

seaborn.histplot(data, x, y, hue, stat, bins, binwidth, discrete, kde, log_scale)

The parameters are:

  • data: The input data, provided mostly as a DataFrame or NumPy array.
  • x, y (optional): The keys in data (or vectors) to be positioned on the x and y axes respectively.
  • hue (optional): A semantic variable that is mapped to determine the color of plot elements.
  • stat (optional): The statistic computed for each bin: count (the default), frequency, probability, percent, or density.
  • bins, binwidth (optional): The number of bins (or the bin edges) and the width of each bin.
  • kde (optional): If True, a Kernel Density Estimate (KDE) is computed and drawn as a line that smooths the histogram.

Here is a code snippet:

import seaborn as sns
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Creating arbitrary dataset from random numbers
np.random.seed(1)
numb_var = np.random.randn(1200)
numb_var = pd.Series(numb_var, name = "Numerical Measures")

# Plotting the histogram
sns.histplot(data = numb_var, kde=True)
plt.show()

Output

This output shows how to use the histplot in Python

 

Add labels

We often need to label the x-axis and y-axis to make the plot easier to read and to give it meaning. Seaborn offers two different ways to set the labels for the x and y axes.
Method 1: Using the set() method: The set() method of the returned Axes object lets us set the labels by passing strings to the xlabel and ylabel parameters. Here is a code snippet showing how we can do that.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
datf = pd.DataFrame({"Season 1": [7, 4, 5, 6, 3],
                 "Season 2" : [1, 2, 8, 4, 9]})
p = sns.histplot(data = datf)
p.set(xlabel="X Label Value", ylabel = "Y Label Value")
plt.show()

Output

This output shows histplot with labels in Python
Method 2: Using Matplotlib’s xlabel() and ylabel(): Seaborn runs on top of Matplotlib, so we can use Matplotlib pyplot’s xlabel() and ylabel() functions to set the labels. The code snippet will look like:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
datf = pd.DataFrame({"Season 1": [7, 4, 5, 6, 3],
                 "Season 2" : [1, 2, 8, 4, 9]})
p = sns.histplot(data = datf)
plt.xlabel('X axis labeling')
plt.ylabel('Y axis labeling')
plt.show()

Output

Matplotlib xlabel() and ylabel()

Remove xlabel or ylabel

We can remove the xlabel and ylabel from a histogram in two different ways. These are:
Method 1: Using the set() method: The set() method lets us pass None as the value of the xlabel and ylabel parameters.

Setting the value to None (a keyword) makes the labels blank, so they are not displayed in the plot. Here is a code snippet for the same.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
datf = pd.DataFrame({"Season 1": [7, 4, 5, 6, 3],
                 "Season 2" : [1, 2, 8, 4, 9]})
p = sns.histplot(data = datf)
p.set(xlabel = None)
p.set(ylabel = None)
plt.show()

Output

This output shows Matplotlib's xlabel() or ylabel() in Python
Method 2: Using the set_ticklabels() method: Another way to create empty labels is to call xaxis.set_ticklabels() and yaxis.set_ticklabels() on the Axes object and pass an empty list [] as the parameter.

In this case, it is the tick values or units along each axis that are removed from the plot; an axis label such as Count is a separate element and still has to be cleared with set() as in Method 1.
Output

This output shows set_ticklabels() method of Histplot in Python

 

Histogram with KDE

Kernel Density Estimation (KDE) is a non-parametric method of estimating the probability density function of a continuous random variable.

It generates a smooth curve that is drawn over the bars of the plot. Seaborn’s histplot() has a kde parameter that accepts True or False.

If you set it to True, it will display the line that estimates the probability density. Here is a code snippet showing how to enable it with histogram plots; leaving kde at its default of False disables the line.

import seaborn as sns
import matplotlib.pyplot as plt
datf = sns.load_dataset("iris")
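# Overlay two histograms; label each one so that plt.legend() has entries to show
z= sns.histplot(data=datf, x="sepal_length", color="orange", alpha = 1.0, kde = True, label="sepal_length")
z= sns.histplot(data=datf, x="sepal_width", color="skyblue", alpha = 1.0, kde = True, label="sepal_width")
z.set_xlabel("Measurement (cm)")
z.set_ylabel("Count")
plt.legend()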
plt.show()

Output

This output shows Seaborn histogram with KDE in Python
We can also customize the KDE line using the line_kws parameter, which accepts a dictionary of Matplotlib line properties.

import seaborn as sns
import matplotlib.pyplot as plt
datf = sns.load_dataset("iris")
# line_kws styles the KDE line; label gives plt.legend() an entry to show
z = sns.histplot(data=datf, x = "sepal_width", color = "red", alpha = 0.6, kde = True,
                 line_kws = {'color':'red','linestyle': 'dashed'}, label = "sepal_width")
z.set_xlabel("Sepal width (cm)")
z.set_ylabel("Count")
plt.legend()
plt.show()

Output

This output shows Seaborn histogram with KDE in Python

 

Add a title

There are different ways to add a title to a graph. These are:
Method 1: Using the set() method: We pass the title string through the title keyword argument of set().

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
datf = pd.DataFrame({"Season 1": [8, 6, 6, 11, 4],
                 "Season 2" : [4, 5, 7, 4, 9]})
p = sns.histplot(data = datf).set(title = "AVERAGE VIEWS (in Millions)")
plt.xlabel('Month')
plt.ylabel('Views')
plt.legend([],[], frameon = False)
plt.show()

Output

This output shows Seaborn histplot add Title in Python
Method 2: Using the set_title() method: The Axes object returned by histplot() has a set_title() method that takes the title string as its parameter. Here is a code snippet on how to use it.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
datf = pd.DataFrame({"Season 1": [8, 6, 6, 11, 4],
                 "Season 2" : [4, 5, 7, 4, 9]})
p = sns.histplot(data = datf).set_title('AVERAGE VIEWS (in Millions)')
plt.xlabel('Month')
plt.ylabel('Views')
plt.legend([],[], frameon = False)
plt.show()

Output

This output shows set_title() method of histplot in Python
Method 3: Using Matplotlib’s title() method: Since Seaborn runs on top of Matplotlib, we can efficiently utilize Matplotlib’s title() method to specify the title for the plot. Here is a code snippet showing its use.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
datf = pd.DataFrame({"Season 1": [8, 6, 6, 11, 4],
                 "Season 2" : [4, 5, 7, 4, 9]})
p = sns.histplot(data = datf)
plt.title("AVERAGE VIEWS (in Millions)")
plt.xlabel('Month')
plt.ylabel('Views')
plt.legend([],[], frameon = False)
plt.show()

Output

This output shows Matplotlib's title() method of histplot in Python

 

Set font size

Font size plays an important part in creating a readable visual. There are two different techniques we can use to set the font size for the visualization. These are:
Method 1: Using the fontsize parameter: We can use this parameter with multiple Matplotlib methods such as set_xlabel(), set_ylabel(), and set_title(), as well as pyplot’s xlabel(), ylabel(), and title(). Here is a code snippet showing how to use it.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
datf = pd.DataFrame({"Season 1": [8, 6, 6, 11, 4],
                 "Season 2" : [4, 5, 7, 4, 9]})
p = sns.histplot(data = datf)
p.set_xlabel("Month", fontsize = 18)
p.set_ylabel("Views", fontsize = 18)
p.set_title("AVERAGE VIEWS (in Millions)", fontsize = 18)
plt.legend([],[], frameon = False)
plt.show()

Output

This output shows Seaborn histplot font size in Python
Method 2: Using the set() method: Seaborn’s set() function accepts a font_scale parameter that scales the size of all the fonts related to the plot. The setting is global, so it also applies to the plots created afterwards. Here’s how to use it.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
datf = pd.DataFrame({"Season 1": [8, 6, 6, 11, 4],
                 "Season 2" : [4, 5, 7, 4, 9]})
sns.set(font_scale = 3)
p = sns.histplot(data = datf)
p.set_xlabel("Month")
p.set_ylabel("Views")
p.set_title("AVERAGE VIEWS (in Millions)")
plt.legend([],[], frameon = False)
plt.show()

Output

This output shows set() method in Python

 

Set custom palette

Palettes are named collections of colors or color gradients. We can set the color palette for our histogram using the palette parameter of the histplot() method; the palette is applied to the levels of the hue variable, or to the columns when a wide-form DataFrame is passed.

Some well-known palette values are tab10, hls, husl, Set2, Paired, rocket, mako, flare, Blues_r, etc. Here is a code snippet showing how to use palettes.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
datf = pd.DataFrame({"Season 1": [8, 6, 6, 11, 4],
                 "Season 2" : [4, 5, 7, 4, 9]})
sns.set(font_scale = 2)
p = sns.histplot(data = datf, legend=False, palette="Blues_r")
p.set_xlabel("Month")
p.set_ylabel("Views")
plt.show()

Output

This output shows histplot custom palette in Python

Or,

p = sns.histplot(data = datf, legend=False, palette="rocket")

Output

This output shows histplot custom palette in Python
Note that palette names are case sensitive.
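
Because a misspelled name only fails when the plot is drawn, it can help to check a name beforehand. The helper below, is_valid_palette(), is a hypothetical function written for this example; it relies on seaborn's color_palette() raising an error for names it cannot resolve.

```python
import seaborn as sns

def is_valid_palette(name):
    """Return True if seaborn can resolve the palette name (hypothetical helper)."""
    try:
        sns.color_palette(name)
        return True
    except (ValueError, KeyError):
        return False

# Names differing only in case or in stray whitespace are treated as different names
for name in ["rocket", "Set2", "set2", "rocket "]:
    print(repr(name), is_valid_palette(name))
```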

 

Histograms with different colors

In a single plot, we can draw two histograms in different colors, each showing a different insight about the data. We can do this in two different ways.

  1. Using Palette parameter: We can use the palette parameter to generate a histogram plot with different colors. Here is a code snippet showing how to generate a plot with different colors.
    import seaborn as sns
    import matplotlib.pyplot as plt
    import pandas as pd
    sns.set(style = "darkgrid")
    datf = pd.DataFrame({"Season 1": [8, 6, 6, 11, 4],
                     "Season 2" : [4, 5, 7, 4, 9]})
    sns.histplot(data=datf, palette="rocket", label="Last Year Report")
    sns.histplot(data=datf, palette="mako", label="This Year's Report")
    plt.legend()
    plt.show()

    Output

    This output shows Using Palette parameter in Python

  2. Using Color parameter: We can use the color parameter to generate a histogram plot with different colors. Here is a code snippet showing how to generate a plot with different colors.
    import seaborn as sns
    import matplotlib.pyplot as plt
    datf = sns.load_dataset("iris")
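    # Label each histogram so that plt.legend() has entries to show
    z= sns.histplot(data=datf, x="sepal_length", color="orange", kde = True, label="sepal_length")
    z= sns.histplot(data=datf, x="sepal_width", color="skyblue", kde = True, label="sepal_width")
    z.set_xlabel("Measurement (cm)")
    z.set_ylabel("Count")
    plt.legend()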
    plt.show()

    Output

    This output shows Using Color parameter in Python

 

Histogram with conditional color

We can use if statements to choose the color palette based on a condition, such as a value entered by the user. The data for the plot is generated with NumPy’s random module.

Here is the program showing how to generate a histogram with conditions for colors.

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
# Ask the user which palette to use (1 to 4)
x=int(input("Enter the number to generate a histogram with different color palettes: "))
if x==1:
    df = pd.DataFrame({'col1':'A',
                    'col2':np.random.randn(100) * 10 + 50})
    sns.histplot(data = df, palette = "husl")
elif x==2:
    df = pd.DataFrame({'col1':'B',
                    'col2':np.random.randn(100) * 10 + 60})
    sns.histplot(data = df, palette = "Blues_r")
elif x==3:
    df = pd.DataFrame({'col1':'C',
                    'col2':np.random.randn(100) * 10 + 70})
    sns.histplot(data = df, palette = "rocket")
elif x==4:
    df = pd.DataFrame({'col1':'C',
                    'col2':np.random.randn(100) * 10 + 70})
    sns.histplot(data = df, palette = "hls")
else:
    # Any other number leaves the plot empty
    print("Please enter a number from 1 to 4.")
plt.show()

Output

This output shows how to Generate Seaborn Histplot with conditional color in Python

 

Change opacity

We can change the value of the alpha parameter to change the transparency of the histogram plot. The value ranges from 0 (fully transparent) to 1 (fully opaque): as the alpha value decreases, the opacity decreases.

With the increase in the alpha value, the opacity increases. Here is a code snippet showing how to use the alpha parameter of the histplot() method.

import seaborn as sns
import matplotlib.pyplot as plt
datf = sns.load_dataset("iris")
# A low alpha makes both histograms almost transparent
z= sns.histplot(data=datf, x="sepal_length", color="orange", alpha = 0.05, kde = True, label="sepal_length")
z= sns.histplot(data=datf, x="sepal_width", color="skyblue", alpha = 0.05, kde = True, label="sepal_width")
z.set_xlabel("Measurement (cm)")
z.set_ylabel("Count")
plt.legend()
plt.show()

Output

This output shows Seaborn histplot opacity in Python
Now, let us increase the alpha value to increase the opacity.

import seaborn as sns
import matplotlib.pyplot as plt
datf = sns.load_dataset("iris")
# alpha = 1.0 makes both histograms fully opaque
z= sns.histplot(data=datf, x="sepal_length", color="orange", alpha = 1.0, kde = True, label="sepal_length")
z= sns.histplot(data=datf, x="sepal_width", color="skyblue", alpha = 1.0, kde = True, label="sepal_width")
z.set_xlabel("Measurement (cm)")
z.set_ylabel("Count")
plt.legend()
plt.show()

Output

This output shows Seaborn histplot opacity in Python

 

Change axis range

Seaborn allows us to change the axis range for the x and y axes.
Method 1: We can change the axis range by using Matplotlib’s matplotlib.axes.Axes.set_xlim() and matplotlib.axes.Axes.set_ylim() methods on the Axes object that histplot() returns.

Here is a code snippet showing how to change the axis range.

import seaborn as sns
import matplotlib.pyplot as plt
datf = sns.load_dataset("iris")
z= sns.histplot(data=datf, x="sepal_length", color="orange", alpha = 1.0, kde = True, label="sepal_length")
z= sns.histplot(data=datf, x="sepal_width", color="skyblue", alpha = 1.0, kde = True, label="sepal_width")
z.set_xlabel("Measurement (cm)")
z.set_ylabel("Count")
# Show the x-axis from 1 to 20; uncomment the next line to limit the y-axis as well
z.set_xlim(1, 20)
#z.set_ylim(1, 10)
plt.legend()
plt.show()

Output

This output shows Seaborn histplot axis range in Python
Method 2: We can also use the set() method to change the axis range. Here is a code snippet showing how to change the axis range using set().

import seaborn as sns
import matplotlib.pyplot as plt
datf = sns.load_dataset("iris")
z= sns.histplot(data=datf, x="sepal_length", color="orange", alpha = 1.0, kde = True, label="sepal_length")
z= sns.histplot(data=datf, x="sepal_width", color="skyblue", alpha = 1.0, kde = True, label="sepal_width")
z.set_xlabel("Measurement (cm)")
z.set_ylabel("Count")
# set() accepts both limits in a single call
z.set(xlim=(0,10),ylim=(0,100))
plt.legend()
plt.show()

Output

This output shows Seaborn histplot axis range in Python

 

Add space between bars

We can provide spaces between histogram bars. Here is a code snippet showing how to do it.

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
datf = pd.DataFrame({'Name': ['Karl', 'Ray', 'Sue', 'Dee'], 'SalInLac': [25, 28, 21, 26], 'Gender': ['M', 'M', 'F', 'F']})
datf = pd.concat([datf[datf.Gender == 'M'], pd.DataFrame({'Name': [''], 'SalInLac': [0], 'Gender': ['M']}), datf[datf.Gender == 'F']])
age_plot = sns.histplot(data = datf)
plt.setp(age_plot.get_xticklabels(), rotation=90)
plt.ylim(0, 5)
age_plot.tick_params(labelsize = 6)
age_plot.tick_params(length = 5, axis='x')
age_plot.set_ylabel("Age", fontsize=12)
age_plot.set_xlabel("", fontsize=1.5)
plt.tight_layout()
plt.show()

Output

This output shows Seaborn histplot space between bars in Python

 

Changing the orientation

We can tweak the x and y parameters to change the orientation of the histogram plot and change it from vertical to horizontal.

We can put the data on the y axis rather than typically putting it in x.

Here is a code snippet showing how to do so:

import matplotlib.pyplot as plt
import seaborn as sns
tips = sns.load_dataset("tips")
tips.head()
#Changing the orientation of the plot
g = sns.histplot(data=tips, y="total_bill", color="lime")
g.set_ylabel("Bill", fontsize=12)
g.set_xlabel("")
plt.show()

Output

This output shows Histplot Changing the orientation of the plot in Python

 

Histogram with dates

We can also use dates as the values on one axis of the histplot. For this, we store the dates as a list of strings in the DataFrame.

Then, we use them as the x or y values to display them. Here is a code snippet to show how to display dates.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
datf = pd.DataFrame({'date': ['1/11/2022', '3/21/2022', '5/31/2022', '8/28/2022'],
                   'salesPoint': [11, 9, 10, 16],
                   'Branding_Group': ['A','B','A','B']})
sns.set(font_scale = 2)
ax = sns.histplot(x = 'date', y = 'salesPoint', hue = 'Branding_Group', data = datf)
plt.legend()
plt.show()

Output

This output shows Seaborn histplot with dates in Python

 

No attribute error

AttributeError: module 'seaborn' has no attribute 'histplot' is a prominent error you can face while working with Seaborn and histplot. It occurs when the installed Seaborn is out of date: histplot() was added in version 0.11.0, so older releases do not have it.

The error can appear even on an otherwise up-to-date system if the Seaborn version installed in the active environment is older than that.

To fix this error, you have to update your seaborn library. Run the command in the Notebook or app’s command-line section to fix the issue.

pip install -U seaborn

If you are using Jupyter, then, this code will also work.

pip install seaborn --upgrade
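
To confirm whether an upgrade is needed, or whether it worked, we can check the installed version from Python. This is a small sketch; it only reports what is installed in the environment that runs it.

```python
import seaborn as sns

# histplot() was added in seaborn 0.11.0; older releases do not have it
print("seaborn version:", sns.__version__)
print("histplot available:", hasattr(sns, "histplot"))
```

If the second line prints False, the interpreter is still importing an older Seaborn, possibly from a different environment than the one that was upgraded.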

 

Conclusion

We hope this tutorial has given you a clear idea of Seaborn histplot() and how to use it to generate a histogram plot. This tutorial also discussed various parameters that we can use to customize the histogram.

This comprehensive guide also talks about titles, palettes, colors, font size, labels, opacity, KDEs, orientation, etc., required to generate a clearer picture of the seaborn histogram.
