Data Viz with Python and R

Learn to Make Plots in Python and R

How to Make Violin plots with Matplotlib

datavizpyr · June 29, 2022 ·

In this tutorial, we will learn how to make violin plots using Python’s Matplotlib library. Matplotlib has a function called violinplot(); we will first use it to make a basic violin plot and then learn to customize it.

Unlike other tutorials on violinplot with Matplotlib, here we start with data stored in a Pandas dataframe and show the distribution of multiple groups as violin plots.

Let us get started by loading Matplotlib and other needed packages.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

We use the Palmer penguins dataset to make the violin plots. The data is available from datavizpyr.com’s GitHub page.

penguins_data="https://raw.githubusercontent.com/datavizpyr/data/master/palmer_penguin_species.tsv"
# load penguins data with Pandas read_csv
df = pd.read_csv(penguins_data, sep="\t")
# remove rows with missing values
df = df.dropna()
df.head()

species	island	culmen_length_mm	culmen_depth_mm	flipper_length_mm	body_mass_g	sex
0	Adelie	Torgersen	39.1	18.7	181.0	3750.0	MALE
1	Adelie	Torgersen	39.5	17.4	186.0	3800.0	FEMALE
2	Adelie	Torgersen	40.3	18.0	195.0	3250.0	FEMALE
4	Adelie	Torgersen	36.7	19.3	193.0	3450.0	FEMALE
5	Adelie	Torgersen	39.3	20.6	190.0	3650.0	MALE

We will make violin plots of body mass for the different penguin species. To get the body mass values for the three species as lists, we group by species and aggregate the body mass variable into a list.

data = (df.
        groupby('species')["body_mass_g"].
        agg(lambda x: list(x)))

Our data for the violin plot looks like this:

data

species
Adelie       [3750.0, 3800.0, 3250.0, 3450.0, 3650.0, 3625....
Chinstrap    [3500.0, 3900.0, 3650.0, 3525.0, 3725.0, 3950....
Gentoo       [4500.0, 5700.0, 4450.0, 5700.0, 5400.0, 4550....
Name: body_mass_g, dtype: object
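
To make the grouping step easier to follow, here is a minimal sketch on a tiny made-up dataframe. The values and the names toy and toy_data are assumptions for illustration only and are not taken from the penguin data. It uses agg(list), which should behave the same as the lambda version above.

```python
import pandas as pd

# Toy data, made up for illustration (not the penguin dataset)
toy = pd.DataFrame({
    "species": ["A", "B", "A", "B", "A"],
    "body_mass_g": [10.0, 20.0, 11.0, 21.0, 12.0],
})

# Collect the values of each group into one list per species
toy_data = toy.groupby("species")["body_mass_g"].agg(list)

print(toy_data["A"])         # [10.0, 11.0, 12.0]
print(list(toy_data.index))  # ['A', 'B']
```

The result is a Series with one entry per species, where each entry holds the list of values that violinplot() will draw as one violin.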

We can get the names from the index.

data.index

Index(['Adelie', 'Chinstrap', 'Gentoo'], dtype='object', name='species')

Default Violinplot with Matplotlib

Let us make a violin plot using Matplotlib’s violinplot() function. By default, Matplotlib labels the violins with the numbers 1, 2, 3, and so on along the x-axis. Here we set the x-axis ticks using the set_xticks() function with the species names as labels (the labels argument requires Matplotlib 3.5 or later). We also add axis labels and a title to the plot.

labels = data.index
fig, ax = plt.subplots()
# make violinplot
ax.violinplot(data)
# set x-axis tick labels
ax.set_xticks(np.arange(1, len(labels) + 1), labels=labels)
plt.xlabel("Species",
            size=14)
plt.ylabel("Body Mass (g)", 
            size=14)
plt.title("Violinplot with Palmer Penguin Data", 
            size=16)
plt.savefig("Violinplot_matplotlib_python.png",
                    format='png',dpi=150)

The violin plot made with Matplotlib’s defaults looks like this, with a blue fill and lines marking the extreme (minimum and maximum) values.

Default Violin Plot with Matplotlib
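
If the default blue fill is not what you want, one possible approach is to restyle the artists that violinplot() returns. The sketch below uses randomly generated data rather than the penguin data, and the names groups and parts as well as the chosen colors are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic data for three groups (illustrative only)
rng = np.random.default_rng(0)
groups = [rng.normal(loc, 1.0, size=50) for loc in (0, 2, 4)]

fig, ax = plt.subplots()
# violinplot() returns a dict of the artists it created
parts = ax.violinplot(groups)

# "bodies" holds one filled polygon per violin
colors = ["tab:orange", "tab:green", "tab:purple"]
for body, color in zip(parts["bodies"], colors):
    body.set_facecolor(color)
    body.set_edgecolor("black")
    body.set_alpha(0.6)

# With the default showextrema=True, the min/max caps and the
# connecting bar are available under these keys
for key in ("cmins", "cmaxes", "cbars"):
    parts[key].set_color("gray")
```

The same pattern should carry over to the penguin plot by passing data instead of groups.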

Add Median value to Violinplot with Matplotlib

To customize the violin plot further, let us add the median of each group as a point. To do that, we first compute the median body mass for each species.

medians = (df.
           groupby('species')["body_mass_g"].
           median())

First, we make the violin plot as before. Then we add the median values using Matplotlib’s scatter() function.

inds = np.arange(1, len(medians) + 1)
labels = data.index

fig, ax = plt.subplots()
ax.violinplot(data, 
             showextrema=False)
ax.scatter(inds, medians, marker='o', color='red', s=30, zorder=3)
ax.set_xticks(np.arange(1, len(labels) + 1), labels=labels)
plt.xlabel("Species",
            size=14)
plt.ylabel("Body Mass (g)", 
            size=14)
plt.title("Violinplot with Palmer Penguin Data", 
            size=16)
plt.savefig("Customizing_Violinplot_matplotlib_python.png",
                    format='png',dpi=150)

The median values now appear as red points on the violin plot. We have also set showextrema=False so that the extreme values are not drawn as lines. Everything else is the same as before.
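
As an alternative to computing the medians yourself, violinplot() also accepts a showmedians argument that draws a horizontal line at each group’s median. The sketch below uses randomly generated data, not the penguin data, and the names groups and parts are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic data for three groups (illustrative only)
rng = np.random.default_rng(1)
groups = [rng.normal(loc, 1.0, size=40) for loc in (3, 5, 8)]

fig, ax = plt.subplots()
# Let Matplotlib draw the median lines; hide the extrema lines
parts = ax.violinplot(groups, showmedians=True, showextrema=False)

# The median lines are returned under the "cmedians" key
parts["cmedians"].set_color("red")
```

The scatter() approach used above gives a point marker, while showmedians gives a line, so the choice between them is mostly a matter of style.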

Customizing Violin Plot: Annotating Median (Example 1)

Add lower and upper quartile ranges to Violinplot with Matplotlib

We can customize the violin plot further by marking the lower and upper quartiles. To do that, we first compute them using the quantile() function in Pandas.

quartile1 = (df.
             groupby('species')["body_mass_g"].
             quantile(0.25))
quartile1

species
Adelie       3362.5
Chinstrap    3487.5
Gentoo       4700.0
Name: body_mass_g, dtype: float64

quartile3 = (df.
             groupby('species')["body_mass_g"].
             quantile(0.75))
quartile3

species
Adelie       4000.0
Chinstrap    3950.0
Gentoo       5500.0
Name: body_mass_g, dtype: float64
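
The two quantile() calls, together with the earlier median() call, could also be collapsed into a single step by passing a list of quantiles. Here is a minimal sketch on made-up data; the dataframe toy, the values, and the column names q1, median and q3 are assumptions for illustration.

```python
import pandas as pd

# Toy data, made up for illustration (not the penguin dataset)
toy = pd.DataFrame({
    "species": ["A"] * 5 + ["B"] * 5,
    "body_mass_g": [1.0, 2.0, 3.0, 4.0, 5.0,
                    10.0, 20.0, 30.0, 40.0, 50.0],
})

# Compute all three quartiles per group in one call, then
# unstack so that each quantile becomes its own column
quartiles = (toy.groupby("species")["body_mass_g"]
                .quantile([0.25, 0.5, 0.75])
                .unstack())
quartiles.columns = ["q1", "median", "q3"]

print(quartiles.loc["A", "q1"])  # 2.0
print(quartiles.loc["B", "q3"])  # 40.0
```

Each column of the result can then be passed to scatter() or vlines() in the same way as the separate Series.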

Now, in addition to the violins and the median values, we add vertical lines that span from the first to the third quartile of each group.

inds = np.arange(1, len(medians) + 1)
fig, ax = plt.subplots()
ax.violinplot(data,
              showextrema=True)
# species names for the x-axis tick labels
labels = data.index
# add median value as a point
ax.scatter(inds, medians, marker='o', color='red', s=40, zorder=3)
# Add boxplot-like vertical lines to show the first and third quartile
ax.vlines(inds, quartile1, quartile3, color='k', linestyle='-', lw=6)
ax.set_xticks(np.arange(1, len(labels) + 1), labels=labels)
plt.xlabel("Species",
            fontweight ='bold', 
            size=14)
plt.ylabel("Body Mass (g)", 
            fontweight ='bold',
            size=14)
plt.title("Violinplot with Palmer Penguin Data", 
            fontweight ='bold',
            size=16)
#plt.show()
plt.savefig("Customizing_Violinplot_matplotlib_python_2.png",
                    format='png',dpi=150)

This gives a nice boxplot-like look, with the thick black lines showing the range between the first and third quartiles.

Customizing Violinplot with Matplotlib
