Data Viz with Python and R

Learn to Make Plots in Python and R

How To Make Scatter Plots with Seaborn scatterplot in Python?

datavizpyr · June 22, 2020 ·

Seaborn scatterplot()
Scatter plots are a great way to visualize two quantitative variables and the relationship between them. Often we can show additional variables on the same scatter plot by using the color, shape, and size of the data points.

With Seaborn in Python, we can make scatter plots in multiple ways, using the lmplot(), regplot(), or scatterplot() functions. In this tutorial, we will use Seaborn’s scatterplot() function, which has been available since Seaborn v0.9.0 (July 2018). One benefit of scatterplot() is that we can easily overlay up to three additional variables on the scatter plot: color with the “hue” argument, size with the “size” argument, and shape with the “style” argument.

Let us load the packages we need.

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

Palmer Penguins Dataset
We will learn to make scatter plots using the Palmer Penguins dataset from Palmer Station, Antarctica. It is a great dataset for teaching data exploration and data visualization. The dataset contains body measurements of three penguin species.

Penguin Data were collected and made available by Dr. Kristen Gorman and the Palmer Station, Antarctica LTER.

Thanks to Allison Horst for making the data easily available.

We will load the simplified data directly from the GitHub page.

penguins_data = "https://raw.githubusercontent.com/datavizpyr/data/master/palmer_penguin_species.tsv"
penguins_df = pd.read_csv(penguins_data, sep="\t")

We can see that we have four numerical variables measured on three penguin species. Check the GitHub page for nice illustrations of the body measurements.

  
penguins_df.head()

  species     island  culmen_length_mm  culmen_depth_mm  flipper_length_mm  body_mass_g     sex
0  Adelie  Torgersen              39.1             18.7              181.0       3750.0    MALE
1  Adelie  Torgersen              39.5             17.4              186.0       3800.0  FEMALE
2  Adelie  Torgersen              40.3             18.0              195.0       3250.0  FEMALE
3  Adelie  Torgersen               NaN              NaN                NaN          NaN     NaN
4  Adelie  Torgersen              36.7             19.3              193.0       3450.0  FEMALE
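Row 3 of the preview has missing values (NaN) in every measurement column. As a minimal sketch, assuming we only want to inspect and optionally drop such rows before plotting, the snippet below re-types the five preview rows by hand (the name preview_df is made up for this illustration) and uses pandas' isna() and dropna().

```python
import numpy as np
import pandas as pd

# The five rows shown in the preview, re-typed by hand for illustration
preview_df = pd.DataFrame({
    "species": ["Adelie"] * 5,
    "island": ["Torgersen"] * 5,
    "culmen_length_mm": [39.1, 39.5, 40.3, np.nan, 36.7],
    "culmen_depth_mm": [18.7, 17.4, 18.0, np.nan, 19.3],
    "flipper_length_mm": [181.0, 186.0, 195.0, np.nan, 193.0],
    "body_mass_g": [3750.0, 3800.0, 3250.0, np.nan, 3450.0],
    "sex": ["MALE", "FEMALE", "FEMALE", np.nan, "FEMALE"],
})

# Count missing values in each column
missing_per_column = preview_df.isna().sum()
print(missing_per_column)

# Keep only the rows that have both variables we plan to plot
plot_cols = ["culmen_length_mm", "flipper_length_mm"]
complete_df = preview_df.dropna(subset=plot_cols)
print(len(preview_df), "rows before,", len(complete_df), "rows after")
```

The same two calls can be applied to penguins_df once it is loaded.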

In this tutorial, we will learn 9 tips for making publication-quality scatter plots with Python. We will start with a simple scatter plot made with Seaborn’s scatterplot() function, and then use the features of scatterplot() to improve the plot step by step.

1. How To Make Simple Scatter Plot with Seaborn’s scatterplot()?

Let us get started making scatter plots with the penguin data using Seaborn’s scatterplot() function. First, we will make a simple scatter plot of two numerical variables from the dataset, culmen_length_mm and flipper_length_mm.

We can use Seaborn’s scatterplot() specifying the x and y-axis variables with the data as shown below.

sns.scatterplot(x="culmen_length_mm",
                y="flipper_length_mm",
                data=penguins_df)

And we get a simple scatter plot like this below.

Simple Scatter Plots with Seaborn scatterplot

2. How To Increase Figure Size with Matplotlib in Python?

A look at the scatter plot suggests we can improve the simple version a lot. By default, Seaborn draws the plot at Matplotlib's default figure size. We might want to increase the figure size to make the plot easier to read. To do that, we can call Matplotlib’s figure() function before plotting and specify the dimensions we want (width and height, in inches).

# specify figure size with Matplotlib
plt.figure(figsize=(10,8))
sns.scatterplot(x="culmen_length_mm",
                y="flipper_length_mm", 
                data=penguins_df)

In the example here, we have specified the figure size with figsize=(10,8). We get a bigger scatter plot figure.
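An equivalent way to set the size, shown here as a sketch rather than the approach used in this post, is to create the figure and axes explicitly with plt.subplots() and pass the axes to scatterplot() through its ax argument. The small toy_df data frame below contains made-up values for illustration only, not real penguin measurements.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy values for illustration only (not the real penguin measurements)
toy_df = pd.DataFrame({
    "culmen_length_mm": [39.1, 46.5, 50.0, 38.2],
    "flipper_length_mm": [181.0, 210.0, 196.0, 190.0],
})

# Create a 10 x 8 inch figure and draw the scatter plot on its axes
fig, ax = plt.subplots(figsize=(10, 8))
sns.scatterplot(x="culmen_length_mm",
                y="flipper_length_mm",
                data=toy_df,
                ax=ax)

# Read the size back from the figure to confirm it
width, height = fig.get_size_inches()
print(width, height)
```

Holding on to fig and ax can be convenient later, for example when saving with fig.savefig() or customizing the axes.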

Change Figure Size Seaborn Scatterplot

3. How To Increase Axis Tick Label Size in Seaborn?

Although we have increased the figure size, the axis tick labels are tiny and not easy to read. We can increase the size of the axis tick labels using Seaborn’s plotting_context() function. In this example, we call plotting_context() with the arguments "notebook" and font_scale=1.5.

# specify figure size with Matplotlib
plt.figure(figsize=(10,8))
# Increase axis tick label size with plotting_context in Seaborn
with sns.plotting_context("notebook", font_scale=1.5):
    sns.scatterplot(x="culmen_length_mm",
                    y="flipper_length_mm",
                    data=penguins_df)

Now we have a better-looking scatter plot of culmen length against flipper length, with easily readable axis tick labels.

Increase Axis Label Size: Seaborn scatterplot()

4. How To Change Marker Size in Seaborn Scatterplot?

Before changing the marker size, let us set the axis tick label size for all the plots in the notebook/script. Earlier we used a “with” statement to apply plotting_context() to a single scatter plot only; set_context() applies the same settings globally.

# Set a common plotting context for all the plots
# in the script/notebook
sns.set_context("notebook", font_scale=1.5)
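To make the difference between the two approaches concrete, the following sketch reads the tick label size from Matplotlib's rcParams at four points: before anything is changed, inside the with block, after the block, and after set_context(). The variable names are made up for this illustration, and the exact values printed may depend on the installed Matplotlib and Seaborn versions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import seaborn as sns

# Tick label size before any context is applied
before = plt.rcParams["xtick.labelsize"]

# Inside the with block the scaled context is active
with sns.plotting_context("notebook", font_scale=1.5):
    inside = plt.rcParams["xtick.labelsize"]

# After the block the previous setting is restored
after_with = plt.rcParams["xtick.labelsize"]

# set_context() changes the setting for every plot that follows
sns.set_context("notebook", font_scale=1.5)
after_set = plt.rcParams["xtick.labelsize"]

print(before, inside, after_with, after_set)
```

The with form is handy for a one-off plot, while set_context() suits a script where every figure should share the same scaling.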

We can increase the marker size or the data point size in the scatter plot using the argument “s” in Seaborn’s scatterplot() function.

# set figure size
plt.figure(figsize=(10,8))
# change marker size with s=100 in
# Seaborn scatterplot()
sns.scatterplot(x="culmen_length_mm",
                y="flipper_length_mm",
                s=100,
                data=penguins_df)
plt.savefig("How_To_Change_Marker_Size_Seaborn_ScatterPlot.png",
            format='png')

Now the data points on the scatter plot are bigger and clearly visible.

How To Change Marker Size in Seaborn Scatterplot?

5. How To Change Axis Labels and Size with Matplotlib for Seaborn Scatterplot?

Notice that our x-axis and y-axis labels are simply the column names from the penguin data frame. We can change the axis labels and their sizes using Matplotlib.

We use Matplotlib’s xlabel() and ylabel() functions to change the labels and increase their font sizes.

plt.figure(figsize=(10,8))
sns.scatterplot(x="culmen_length_mm",
                y="flipper_length_mm",
                s=100,
                data=penguins_df)
# set x-axis and y-axis labels with a larger font size
plt.xlabel("Culmen Length (mm)", size=24)
plt.ylabel("Flipper Length (mm)", size=24)
plt.savefig("Customize_Axis_Labels_Scatter_Plot_Penguins_data_Seaborn.png",
            format='png')

We have customized the x-axis and y-axis labels and also increased their font sizes.
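As an alternative sketch, scatterplot() returns the Matplotlib Axes it drew on, so the labels can also be set on that object with set_xlabel() and set_ylabel() instead of the pyplot functions. The toy_df values below are made up for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy values for illustration only (not the real penguin measurements)
toy_df = pd.DataFrame({
    "culmen_length_mm": [39.1, 46.5, 50.0, 38.2],
    "flipper_length_mm": [181.0, 210.0, 196.0, 190.0],
})

plt.figure(figsize=(10, 8))
# scatterplot() returns the Axes object that holds the plot
ax = sns.scatterplot(x="culmen_length_mm",
                     y="flipper_length_mm",
                     s=100,
                     data=toy_df)

# Set the label text and font size directly on the Axes
ax.set_xlabel("Culmen Length (mm)", fontsize=24)
ax.set_ylabel("Flipper Length (mm)", fontsize=24)
print(ax.get_xlabel(), "|", ax.get_ylabel())
```

Working with the Axes object tends to be clearer once a figure has more than one panel, because each call names the axes it modifies.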

Customize Axis Labels: Seaborn Scatterplot

6. How To Color Scatter Plot by a Variable with Seaborn’s scatterplot()?

We can color the data points on the scatter plot by a variable in the data frame using the “hue” argument of Seaborn’s scatterplot() function. In this example, we color the data points by the “species” variable using hue="species".

plt.figure(figsize=(10,8))
sns.scatterplot(x="culmen_length_mm", 
                y="flipper_length_mm",
                s=100,
                hue="species",
                data=penguins_df)
plt.xlabel("Culmen Length (mm)")
plt.ylabel("Flipper Length (mm)")
plt.savefig("Color_scatterplot_by_variable_with_hue_Seaborn_scatterplot.png",
                    format='png',dpi=150)

By coloring the data points by a variable, we have added a third variable to the scatter plot. Seaborn automatically adds a legend that shows which color corresponds to which value of the third variable.

Color by variable with hue: Seaborn scatterplot
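If the default colors are not what we want, scatterplot() also accepts a palette argument, and hue_order fixes the order of the legend entries. The sketch below is one possible choice, not part of the original example: the color names are arbitrary, and toy_df holds made-up values for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy values for illustration only (not the real penguin measurements)
toy_df = pd.DataFrame({
    "culmen_length_mm": [39.1, 38.2, 46.5, 47.3, 50.0, 49.1],
    "flipper_length_mm": [181.0, 190.0, 210.0, 215.0, 196.0, 200.0],
    "species": ["Adelie", "Adelie", "Gentoo", "Gentoo",
                "Chinstrap", "Chinstrap"],
})

# Arbitrary color choice: map each species to a named Matplotlib color
species_colors = {"Adelie": "darkorange",
                  "Chinstrap": "purple",
                  "Gentoo": "teal"}

plt.figure(figsize=(10, 8))
ax = sns.scatterplot(x="culmen_length_mm",
                     y="flipper_length_mm",
                     s=100,
                     hue="species",
                     hue_order=["Adelie", "Chinstrap", "Gentoo"],
                     palette=species_colors,
                     data=toy_df)

# Collect the legend labels Seaborn generated for the hue variable
handles, labels = ax.get_legend_handles_labels()
print(labels)
```

Passing a dictionary keeps each species tied to the same color across several plots, even when a subset of the data is plotted.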

7. How To Change Shape by a Variable in Scatter Plot with Seaborn’s scatterplot()?

In Seaborn’s scatterplot() function, we can change the shape of the markers by a variable using the “style” argument.

plt.figure(figsize=(10,8))
sns.scatterplot(x="culmen_length_mm", 
                y="flipper_length_mm",
                s=100,
                style="sex",
                data=penguins_df)
plt.xlabel("Culmen Length (mm)")
plt.ylabel("Flipper Length (mm)")
plt.savefig("Add_shape_scatterplot_by_variable_with_hue_Seaborn_scatterplot.png",
                    format='png',dpi=150)

In this example, we have changed the marker shape based on the value of the “sex” variable in the data frame. Notice that the markers for males have a different shape from the markers for females.
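The marker shapes themselves can be chosen with the markers argument, which accepts a dictionary mapping each level of the style variable to a Matplotlib marker code. The following is a sketch under the assumption that rows with a missing sex value should be dropped first; the marker choices are arbitrary and toy_df holds made-up values for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Toy values for illustration only (not the real penguin measurements)
toy_df = pd.DataFrame({
    "culmen_length_mm": [39.1, 39.5, 40.3, 41.0, 36.7],
    "flipper_length_mm": [181.0, 186.0, 195.0, 188.0, 193.0],
    "sex": ["MALE", "FEMALE", "FEMALE", np.nan, "MALE"],
})

# Drop rows where the style variable is missing
clean_df = toy_df.dropna(subset=["sex"])

plt.figure(figsize=(10, 8))
ax = sns.scatterplot(x="culmen_length_mm",
                     y="flipper_length_mm",
                     s=100,
                     style="sex",
                     # arbitrary choice: squares for males, circles for females
                     markers={"MALE": "s", "FEMALE": "o"},
                     data=clean_df)

handles, labels = ax.get_legend_handles_labels()
print(labels)
```

Both markers used here are filled shapes; mixing filled and unfilled markers in one call is best avoided.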

Add Shape by Variable: with style: Seaborn scatterplot

8. How To Change Color and Shape in Scatter Plot by Two Variables in Seaborn’s scatterplot()?

One of the advantages of Seaborn’s scatterplot() function is that we can easily combine hue and style to color the data points by one variable and change the marker shape based on another. This way we display four variables in a single scatter plot: two on the axes, one as color, and one as shape.

plt.figure(figsize=(10,8))
sns.scatterplot(x="culmen_length_mm", 
                y="flipper_length_mm", 
                s=100,
                hue="species",
                style="sex",
                data=penguins_df)
plt.xlabel("Culmen Length (mm)")
plt.ylabel("Flipper Length (mm)")
plt.savefig("Color_and_shape_by_variable_Seaborn_scatterplot.png",
                    format='png',dpi=150)

We have colored data points by Penguin species and changed marker shapes by penguin’s sex. This enables us to visualize the relationship between culmen length and flipper length with respect to species and sex.

Color and Shape by variables: Seaborn scatterplot()

9. How to Change Color, Shape and Size By Three Variables in Seaborn’s scatterplot()

With Seaborn’s scatterplot we can change Color, Shape and Size by three variables using the arguments hue, style, and size.

plt.figure(figsize=(12,10))
sns.scatterplot(x="culmen_length_mm", 
                y="flipper_length_mm", 
                size="body_mass_g",
                hue="species",
                style="sex",
                data=penguins_df)
plt.xlabel("Culmen Length (mm)")
plt.ylabel("Flipper Length (mm)")
plt.savefig("Change_Size_Color_Shape_by_three_variables_Seaborn_scatterplot.png",
                    format='png',dpi=150)

In this example, we have mapped body mass to marker size, adding one more variable to the scatter plot on top of species (color) and sex (shape). With size mapped to a variable, the simple scatter plot becomes a bubble plot.
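When size is mapped to a numeric variable, the range of marker sizes can be controlled with the sizes argument, given as a (minimum, maximum) tuple of marker areas. The sketch below uses an arbitrary range of (40, 400) and a toy_df of made-up values for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy values for illustration only (not the real penguin measurements)
toy_df = pd.DataFrame({
    "culmen_length_mm": [39.1, 46.5, 50.0, 38.2],
    "flipper_length_mm": [181.0, 210.0, 196.0, 190.0],
    "body_mass_g": [3250.0, 5700.0, 3750.0, 4500.0],
})

plt.figure(figsize=(12, 10))
ax = sns.scatterplot(x="culmen_length_mm",
                     y="flipper_length_mm",
                     size="body_mass_g",
                     sizes=(40, 400),  # smallest and largest marker area
                     data=toy_df)

# Marker areas actually assigned to the plotted points
marker_sizes = ax.collections[0].get_sizes()
print(marker_sizes.min(), marker_sizes.max())
```

A wider range makes differences in body mass easier to see, at the cost of more overlap between large bubbles.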

Change Size, Color, and Shape by three variables: Seaborn Scatterplot

Although the ability to add three extra variables is nice, it can also make the plot harder to interpret. When a single plot gets too busy, it is often clearer to spread the variables across several simpler plots.
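One option, offered here as a suggestion rather than something covered in this post, is to split the data into small multiples with Seaborn's relplot() function, which draws one scatter plot per level of the variable passed to col. The toy_df values below are made up for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import pandas as pd
import seaborn as sns

# Toy values for illustration only (not the real penguin measurements)
toy_df = pd.DataFrame({
    "culmen_length_mm": [39.1, 38.2, 46.5, 47.3, 50.0, 49.1],
    "flipper_length_mm": [181.0, 190.0, 210.0, 215.0, 196.0, 200.0],
    "species": ["Adelie", "Adelie", "Gentoo", "Gentoo",
                "Chinstrap", "Chinstrap"],
    "sex": ["MALE", "FEMALE", "MALE", "FEMALE", "MALE", "FEMALE"],
})

# One panel per species, with color showing sex inside each panel
g = sns.relplot(x="culmen_length_mm",
                y="flipper_length_mm",
                col="species",
                hue="sex",
                kind="scatter",
                height=4,
                data=toy_df)

# g.axes is a 2D array of Axes, one per panel
n_panels = g.axes.size
print(n_panels)
```

Each panel then carries fewer encodings, and the shared axes still make the species comparable.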
