How To Make Grouped Boxplot with Seaborn Catplot?


When you have multiple groups, and subgroups within each group, with associated numerical values, a grouped boxplot is a good way to visualize them. With Seaborn we can make grouped boxplots using the boxplot() function or the newer catplot() function. Seaborn’s catplot() unifies multiple data visualization techniques, including boxplots, for data with a numerical variable and one or more categorical variables.

Let us import Seaborn, Pandas and Matplotlib to make a grouped boxplot using catplot().

import seaborn as sns 
import matplotlib.pyplot as plt
import pandas as pd

We will make the grouped boxplot using the stocks dataset available from vega_datasets. The stocks data contains stock prices for top tech companies (IBM, Apple, Microsoft, Google, and Amazon) for the years 2000 to 2010.

from vega_datasets import data
stocks = data.stocks()
stocks.head()

	symbol	date	price
0	MSFT	2000-01-01	39.81
1	MSFT	2000-02-01	36.35
2	MSFT	2000-03-01	43.22
3	MSFT	2000-04-01	28.37
4	MSFT	2000-05-01	25.45

Let us create a year variable from the date column. In Pandas, we can first convert the date column to a DatetimeIndex and then use its year attribute to get the year of each date. We will use the year variable in our grouped boxplots with catplot().

stocks['year']=pd.DatetimeIndex(stocks['date']).year
stocks.head()

	symbol	date	price	year
0	MSFT	2000-01-01	39.81	2000
1	MSFT	2000-02-01	36.35	2000
2	MSFT	2000-03-01	43.22	2000
3	MSFT	2000-04-01	28.37	2000
4	MSFT	2000-05-01	25.45	2000
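The same year column can also be derived with pd.to_datetime() and the .dt accessor. The following is a minimal sketch, assuming a small hand-made frame named sample that copies the first rows shown above with the dates stored as strings (the name and the string storage are assumptions for illustration, not a statement about what vega_datasets returns):

```python
import pandas as pd

# Hypothetical stand-in for the first rows of the stocks data shown above
sample = pd.DataFrame({
    "symbol": ["MSFT", "MSFT", "MSFT"],
    "date": ["2000-01-01", "2000-02-01", "2000-03-01"],
    "price": [39.81, 36.35, 43.22],
})

# Parse the dates, then pull the year out with the .dt accessor
sample["year"] = pd.to_datetime(sample["date"]).dt.year
print(sample)
```

Both routes should give the same integer year values; the .dt form keeps the operation on the column itself instead of building a separate index object.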

For the sake of simplicity, we will filter the stocks data to keep only the stock prices for the years 2007, 2008 and 2009.

stocks_df = stocks.query('year>=2007 & year<=2009')
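If you prefer a boolean mask over a query string, the same filter can be written with Series.between(). This is a small sketch on a made-up frame named toy (the frame and its values are hypothetical, used only so the example runs on its own):

```python
import pandas as pd

# Hypothetical miniature of the stocks data, one row per year
toy = pd.DataFrame({
    "symbol": ["IBM"] * 5,
    "year": [2006, 2007, 2008, 2009, 2010],
    "price": [80.0, 100.0, 90.0, 110.0, 120.0],
})

# between() includes both end points by default
toy_df = toy[toy["year"].between(2007, 2009)]

# Same rows as the query() form used above
same = toy.query("year>=2007 & year<=2009")
print(toy_df["year"].tolist())
```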

Let us first start by making a simple boxplot using catplot() in Seaborn. To make a boxplot with Seaborn’s catplot(), we need to use the kind="box" argument.

sns.catplot(x='symbol', y='price',
            data=stocks_df, kind="box",
            height=6, aspect=1.3);

Boxplot with Seaborn’s Catplot

Grouped Boxplot with Seaborn Catplot

We have a simple data set with one numerical variable, stock price, and two categorical variables, tech company and year, which we will use to make a grouped boxplot with Seaborn’s catplot().

To make a grouped boxplot using catplot(), we first need to specify which variables go on the x and y axes. The variable on the x-axis is a categorical variable and the variable on the y-axis is a numerical variable. We need to specify kind="box" to tell catplot() that we want boxplots. In addition to the x and y variables, we need to pass the second categorical variable to the hue parameter. Seaborn’s catplot() will use the hue variable to split the boxplot for each group on the x-axis and make grouped boxplots.

sns.set(font_scale = 1.5)
sns.set_style("white")
sns.catplot(x='symbol', y='price',
            hue="year",
            data=stocks_df, kind="box",
            height=6, aspect=1.3);
plt.savefig("grouped_boxplot_Seaborn_Catplot_Python.png")
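To check what each box in the grouped boxplot above summarizes, it can help to compute the quartiles per company and year directly. In a standard boxplot the box spans the first to third quartile with a line at the median. The sketch below uses a made-up frame named toy with hypothetical prices, not the real stocks values:

```python
import pandas as pd

# Hypothetical prices for two companies and two years
toy = pd.DataFrame({
    "symbol": ["AAPL"] * 6 + ["IBM"] * 3,
    "year": [2007] * 3 + [2008] * 3 + [2007] * 3,
    "price": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 100.0, 110.0, 120.0],
})

# One row per (symbol, year) pair, one column per quartile
quartiles = (
    toy.groupby(["symbol", "year"])["price"]
       .quantile([0.25, 0.5, 0.75])
       .unstack()
)
print(quartiles)
```

Each row of quartiles corresponds to one box: the x-axis group is the symbol and the hue subgroup is the year.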

Customizing Grouped Boxplot with Seaborn Catplot

In this example, we have customized the grouped boxplot in a few ways. We first set the font scale with Seaborn’s set() function and the style with Seaborn’s set_style() function. In addition to saving the grouped boxplot as a PNG file, we have also specified the size of the plot using the height and aspect arguments inside catplot().
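catplot() returns a FacetGrid object, so further tweaks can be made on that object instead of going through plt. The sketch below is one possible way to do this, assuming synthetic prices generated with NumPy (the frame toy, the axis label text and the output file name are all made up for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import os
import tempfile

import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic prices, invented for illustration (not the vega_datasets values)
rng = np.random.default_rng(0)
rows = []
for symbol, base in [("AAPL", 120.0), ("IBM", 100.0), ("MSFT", 25.0)]:
    for year in (2007, 2008, 2009):
        for price in base + rng.normal(0, 5, size=12):
            rows.append({"symbol": symbol, "year": year, "price": price})
toy = pd.DataFrame(rows)

# Same font scale and style as in the example above, set in one call
sns.set(style="white", font_scale=1.5)

# Keep the returned FacetGrid so it can be customized afterwards
g = sns.catplot(x="symbol", y="price", hue="year",
                data=toy, kind="box", height=6, aspect=1.3)
g.set_axis_labels("Company", "Stock price")

# Save through the grid itself; the path is a temporary, hypothetical location
out_path = os.path.join(tempfile.mkdtemp(), "grouped_boxplot_toy.png")
g.savefig(out_path)
```

Saving through g.savefig() ties the output to this specific figure, which can be safer than plt.savefig() when several figures are open.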
