How To Make Bubble Plot with Seaborn Scatterplot in Python?

Bubble plot Seaborn color by variable

Bubble plots are scatter plots that draw bubbles (color-filled circles) in place of plain data points. The size of each bubble is set by a third variable in the data, and the bubbles can also be colored by another variable in the dataset.

Let us load the libraries needed.

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

We will use the fantastic new Penguins data, which is great for data visualization and for learning machine learning and data science. We will load the penguins data from a GitHub page.

penguins_data="https://raw.githubusercontent.com/datavizpyr/data/master/palmer_penguin_species.tsv"

Let us load the simplified Penguins data as a Pandas dataframe.

penguins_df = pd.read_csv(penguins_data, sep="\t")
penguins_df.head()

species	island	culmen_length_mm	culmen_depth_mm	flipper_length_mm	body_mass_g	sex
0	Adelie	Torgersen	39.1	18.7	181.0	3750.0	MALE
1	Adelie	Torgersen	39.5	17.4	186.0	3800.0	FEMALE
2	Adelie	Torgersen	40.3	18.0	195.0	3250.0	FEMALE
3	Adelie	Torgersen	NaN	NaN	NaN	NaN	NaN
4	Adelie	Torgersen	36.7	19.3	193.0	3450.0	FEMALE
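The preview above shows that row 3 has no measurements at all. As a minimal sketch, assuming we want to know how many rows can actually be drawn, we can drop the rows that are missing any of the plotted variables before making the plots. The names preview_df, plot_cols and clean_df are hypothetical, and the small dataframe is rebuilt from the five preview rows so that the example runs without a download.

```python
import numpy as np
import pandas as pd

# Stand-in for penguins_df, rebuilt from the five rows previewed above
# (hypothetical name; with the real data, use penguins_df instead)
preview_df = pd.DataFrame({
    "species": ["Adelie"] * 5,
    "culmen_length_mm": [39.1, 39.5, 40.3, np.nan, 36.7],
    "culmen_depth_mm": [18.7, 17.4, 18.0, np.nan, 19.3],
    "body_mass_g": [3750.0, 3800.0, 3250.0, np.nan, 3450.0],
})

# Columns used for the x-axis, y-axis and bubble size in this post
plot_cols = ["culmen_length_mm", "culmen_depth_mm", "body_mass_g"]

# Keep only rows where every plotted variable is present
clean_df = preview_df.dropna(subset=plot_cols)

print(len(preview_df), "rows before,", len(clean_df), "rows after")
```

The plots in this post pass the full dataframe to Seaborn directly, so this step is optional. It mainly makes the number of plotted penguins explicit.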

Scatterplot with Seaborn

As mentioned before, a bubble plot is a special type of scatter plot that uses bubbles instead of simple data points. Let us first make a simple scatter plot using Seaborn’s scatterplot() function.

sns.set_context("talk", font_scale=1.1)
plt.figure(figsize=(8,6))
sns.scatterplot(x="culmen_length_mm", 
                y="culmen_depth_mm", 
                data=penguins_df)
plt.xlabel("Culmen Length (mm)")
plt.ylabel("Culmen Depth (mm)")
plt.savefig("Seaborn_scatterplot.png",
                    format='png',dpi=150)

Scatterplot Seaborn

Bubble plot with Seaborn scatterplot()

To make a bubble plot in Seaborn, we can use the scatterplot() function with the “size” argument set to a variable, in addition to the x- and y-axis variables of the scatter plot.

In this bubble plot example, we have size="body_mass_g". This creates a bubble plot whose bubble sizes vary with the body mass variable.

sns.set_context("talk", font_scale=1.1)
plt.figure(figsize=(10,6))
sns.scatterplot(x="culmen_length_mm", 
                y="culmen_depth_mm",
                size="body_mass_g",            
                data=penguins_df)
# Put the legend out of the figure
plt.legend(bbox_to_anchor=(1.01, 1),borderaxespad=0)
plt.xlabel("Culmen Length (mm)")
plt.ylabel("Culmen Depth (mm)")
plt.tight_layout()
plt.savefig("Bubble_plot_Seaborn_scatterplot.png",
                    format='png',dpi=150)

We now have a simple bubble plot made with the Seaborn scatterplot() function. Note that the size of the points varies with the size variable.

To keep the data points easy to see, we also place the legend outside the plotting area.
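The legend placement can also be written against the Axes object, with an explicit loc so that it is clear which corner of the legend box sits at the anchor point. This is a minimal sketch, not the code used for the figures in this post. The dataframe demo_df is a hypothetical stand-in built from the preview rows, and the Agg backend is selected only so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for penguins_df, taken from the preview rows
demo_df = pd.DataFrame({
    "culmen_length_mm": [39.1, 39.5, 40.3, 36.7],
    "culmen_depth_mm": [18.7, 17.4, 18.0, 19.3],
    "body_mass_g": [3750.0, 3800.0, 3250.0, 3450.0],
})

fig, ax = plt.subplots(figsize=(10, 6))
sns.scatterplot(x="culmen_length_mm",
                y="culmen_depth_mm",
                size="body_mass_g",
                data=demo_df,
                ax=ax)

# loc="upper left" pins the legend's upper-left corner to the anchor,
# which lies just right of the axes (x=1.01) at the top (y=1)
legend = ax.legend(loc="upper left",
                   bbox_to_anchor=(1.01, 1),
                   borderaxespad=0)

# Make room for the legend that now sits outside the axes
fig.tight_layout()
```

Anchor coordinates are given in axes fractions, so x values slightly above 1 move the legend just past the right edge of the plotting area.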

Bubble Plot Seaborn scatterplot()

Bubble plot with specific size ranges using Seaborn scatterplot()

We can easily customize the bubble plot made with Seaborn. One thing we notice in the bubble plot above is that the range of bubble sizes is small. It would be great if we could set the smallest and largest bubble sizes.

With the “sizes” argument of Seaborn’s scatterplot() function, we can specify the range of bubble sizes. In the bubble plot example below, we use sizes=(20,500) and also set alpha=0.5, which makes the bubbles semi-transparent so that overlapping bubbles remain visible.

sns.set_context("talk", font_scale=1.1)
plt.figure(figsize=(10,6))
sns.scatterplot(x="culmen_length_mm", 
                y="culmen_depth_mm",
                size="body_mass_g",
                sizes=(20,500),
                alpha=0.5,
                data=penguins_df)
# Put the legend out of the figure
plt.legend(bbox_to_anchor=(1.01, 1),borderaxespad=0)
plt.xlabel("Culmen Length (mm)")
plt.ylabel("Culmen Depth (mm)")
plt.title("Bubble plot in Seaborn")
plt.tight_layout()
plt.savefig("Bubble_plot_size_range_Seaborn_scatterplot.png",
                    format='png',dpi=150)

Now our bubble plot looks much better, with the smallest bubble corresponding to the smallest body mass and the biggest bubble corresponding to the largest body mass.
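To make the effect of sizes=(20,500) more concrete, the sketch below rescales a few body mass values linearly onto that range. It assumes that a numeric size variable is mapped roughly this way, with the smallest value getting the lower bound and the largest value the upper bound. The helper rescale_sizes is hypothetical and is not part of Seaborn, and the body masses are taken from the preview rows.

```python
def rescale_sizes(values, sizes=(20, 500)):
    """Linearly map values onto the (smallest, largest) marker size range.

    Hypothetical helper for illustration only; it approximates the
    idea behind the sizes argument and is not Seaborn's own code.
    """
    lo, hi = sizes
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    if span == 0:
        # All values equal: use the lower bound (a choice made for this sketch)
        return [float(lo) for _ in values]
    return [lo + (v - vmin) / span * (hi - lo) for v in values]


# Body masses (in grams) from the preview rows, sorted for readability
body_mass_g = [3250.0, 3450.0, 3750.0, 3800.0]
rescaled = rescale_sizes(body_mass_g, sizes=(20, 500))

for mass, size in zip(body_mass_g, rescaled):
    print(f"{mass:.0f} g -> marker size {size:.1f}")
```

A wider range such as (20, 500) therefore spreads the same body masses over a larger span of marker sizes, which is why the differences between bubbles become easier to see.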

Bubble plot with Size Ranges for Bubbles

Bubble plot with colors by variable using Seaborn scatterplot()

When we have more variables in the data, we can color the bubbles by a fourth variable. To color the bubble plot by a variable, we specify the “hue” argument.

sns.set_context("talk", font_scale=1.1)
plt.figure(figsize=(10,6))
sns.scatterplot(x="culmen_length_mm", 
                y="culmen_depth_mm",
                size="body_mass_g",
                sizes=(20,500),
                alpha=0.5,
                hue="sex",
                data=penguins_df)
# Put the legend out of the figure
plt.legend(bbox_to_anchor=(1.01, 1),borderaxespad=0)
plt.xlabel("Culmen Length (mm)")
plt.ylabel("Culmen Depth (mm)")
plt.title("Bubble plot with Colors in Seaborn")
plt.tight_layout()

plt.savefig("Bubble_plot_Seaborn_color_by_variable_Seaborn_scatterplot.png",
                    format='png',dpi=150)

In this example we color the bubbles by sex, and the colored bubble plot makes it easy to see how body mass relates to sex.
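A quick numeric summary can complement what the colors suggest. The sketch below groups body mass by sex and reports the count and mean for each group. It uses only the four complete rows from the preview, under the hypothetical name preview_df, so the numbers illustrate the method and say nothing about the full dataset. With the real data, the same groupby call would be applied to penguins_df.

```python
import pandas as pd

# The four complete rows from the preview (hypothetical stand-in for penguins_df)
preview_df = pd.DataFrame({
    "sex": ["MALE", "FEMALE", "FEMALE", "FEMALE"],
    "body_mass_g": [3750.0, 3800.0, 3250.0, 3450.0],
})

# Number of penguins and mean body mass within each sex
mass_by_sex = preview_df.groupby("sex")["body_mass_g"].agg(["count", "mean"])
print(mass_by_sex)
```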

Bubble plot Seaborn color by variable
