Data Viz with Python and R

Learn to Make Plots in Python and R

Hierarchically-clustered Heatmap in Python with Seaborn Clustermap

datavizpyr · March 13, 2020

In this post, we will learn how to make a hierarchically clustered heatmap in Python. We will use Seaborn’s clustermap function to make a heatmap with hierarchical clusters. Seaborn’s clustermap is a very versatile function, but we will showcase its use with just one example.

Let us load Pandas, Seaborn and matplotlib.pyplot to make the clustered heatmap.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

We will use the breast cancer data from scikit-learn’s datasets. Let us load the data set from sklearn.datasets. It contains quantitative features describing breast cancer samples.

from sklearn.datasets import load_breast_cancer
bc_data = load_breast_cancer()

Let us store the data as a Pandas dataframe. We will also add the target feature of the data, which indicates whether a sample is benign or malignant.

data = pd.DataFrame(bc_data.data, columns=bc_data.feature_names)
data['target']=bc_data.target
data.iloc[0:3,0:3]
	mean radius	mean texture	mean perimeter
0	17.99	10.38	122.8
1	20.57	17.77	132.9
2	19.69	21.25	130.0

We now have the data ready to make a heatmap with Seaborn’s clustermap. Let us try to make a hierarchically clustered heatmap.

sns.clustermap(data)
plt.savefig('hierarchical_clustered_heatmap_with_Seaborn_clustermap_python_1st_try.png',dpi=150)

The clustered heatmap we got looks really bad. Let us dissect what went wrong and improve it.

Hierarchical Clustered Heatmap with Seaborn Clustermap python: 1st Try
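
One quick way to start dissecting the plot is to look at the range of each feature. The following is a minimal sketch, assuming the same scikit-learn dataset as above; the name `ranges` is introduced here only for illustration.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer

bc_data = load_breast_cancer()
data = pd.DataFrame(bc_data.data, columns=bc_data.feature_names)

# Per-feature minimum and maximum, sorted by the largest maximum
ranges = data.agg(["min", "max"]).T.sort_values("max", ascending=False)

print(ranges.head(3))  # features with the largest values
print(ranges.tail(3))  # features with the smallest values
```

If a few features have values that are orders of magnitude larger than the rest, they will dominate both the color scale and the distances used for clustering.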

By default, Seaborn’s clustermap uses Euclidean distance as the metric for hierarchical clustering. Let us change the metric to correlation by using metric="correlation".

sns.clustermap(data, metric="correlation")
plt.savefig('hierarchical_clustered_corr_heatmap_with_Seaborn_clustermap_python_2nd_try.png',dpi=150)

We seem to have not made any improvement with the metric choice.

Seaborn Clustermap: 2nd Try
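
To see how the two metrics differ, here is a small sketch on made-up rows (hypothetical toy data, not the breast cancer data) using SciPy's `pdist`, which computes pairwise distances between rows.

```python
import numpy as np
from scipy.spatial.distance import pdist

# Toy rows: row 1 is row 0 scaled by 10, row 2 is row 0 reversed
rows = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0],
    [4.0, 3.0, 2.0, 1.0],
])

# pdist returns distances for the pairs (0,1), (0,2), (1,2)
euclid = pdist(rows, metric="euclidean")
corr = pdist(rows, metric="correlation")

print("euclidean  :", euclid.round(2))
print("correlation:", corr.round(2))
```

Euclidean distance places row 0 closer to row 2, because their magnitudes are similar. Correlation distance places row 0 closer to row 1, because it compares the shape of the profile and ignores the overall magnitude.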

If we take a look at the data, we can see that our features are on different scales. Let us rescale each column feature using standard_scale=1, which subtracts the column minimum and divides by the column range, and make the heatmap.

sns.clustermap(data,
               metric="correlation",
               standard_scale=1)
plt.savefig('hierarchical_clustered_scaled_heatmap_with_Seaborn_clustermap_python_3rd_try.png',dpi=150)

Our third try at making the clustered heatmap, after transforming each feature to the same scale, has helped greatly, and our clustered heatmap looks much better.

Seaborn Clustermap: 3rd Try
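
According to Seaborn's documentation, standard_scale subtracts the minimum and divides by the range along the chosen dimension, so with standard_scale=1 every column ends up between 0 and 1. The sketch below reproduces that computation by hand on a hypothetical two-column dataframe; the column names and values are made up for illustration.

```python
import pandas as pd

# Toy data with two features on very different scales
toy = pd.DataFrame({
    "area": [500.0, 1000.0, 2000.0],
    "smoothness": [0.05, 0.10, 0.15],
})

# Column-wise min-max scaling: subtract the minimum, divide by the range
scaled = (toy - toy.min()) / (toy.max() - toy.min())
print(scaled)
```

After scaling, both columns span the same 0 to 1 range, so neither dominates the heatmap colors. If centering to zero mean and unit variance is preferred instead, clustermap also accepts a z_score argument.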

In the breast cancer data, we also have the group identity, i.e. whether the sample is benign or malignant. Let us add that to our heatmap. The basic idea is to assign a color to each group and add a column with that color.

Let us first create a color dictionary mapping each group to a color. Then we use the Pandas map function to create a new color variable from the dictionary and the disease group variable.

import numpy as np
# Map each class label to a color (in this dataset 0 = malignant, 1 = benign)
color_dict=dict(zip(np.unique(bc_data.target),np.array(['g','skyblue'])))
target_df = pd.DataFrame({"target":bc_data.target})
row_colors = target_df.target.map(color_dict)
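
The mapping step can be checked in isolation on a few made-up labels. This is only an illustrative sketch; the four labels below are hypothetical.

```python
import pandas as pd

# Four made-up class labels
target = pd.Series([0, 1, 1, 0], name="target")

# Dictionary from class label to color name
color_dict = {0: "g", 1: "skyblue"}

# map looks up each label in the dictionary, keeping the original index
row_colors = target.map(color_dict)
print(row_colors.tolist())  # ['g', 'skyblue', 'skyblue', 'g']
```

Because map preserves the index, the resulting colors stay aligned with the rows of the dataframe they were derived from.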

Now we can feed that color variable to the argument row_colors in Seaborn’s clustermap function.

sns.clustermap(data,
               metric="correlation",
               standard_scale=1,
               row_colors=row_colors)
plt.savefig('hierarchical_clustered_heatmap_with_Seaborn_clustermap_python.png',dpi=150)

This gives us a clustered heatmap with a color column for the target, i.e. the group-level information for each sample. We can see that some members of each group cluster nicely together, while others do not with our chosen clustering metric.

Hierarchical clustered heatmap with Seaborn Clustermap python

Let us remove the tick labels on the y-axis using yticklabels=False.

sns.clustermap(data, 
               metric="correlation",
               standard_scale=1,
               row_colors=row_colors,
               yticklabels=False)
plt.savefig('hierarchical_clustered_heatmap_2_with_Seaborn_clustermap_python.png',dpi=150)

Hierarchical clustered heatmap with Seaborn Clustermap python
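
With the tick labels hidden, the plot no longer shows which sample sits in which row. The clustermap function returns a ClusterGrid object, and the row order after clustering can be read from it. The sketch below uses hypothetical random data rather than the breast cancer data, and the non-interactive Agg backend is an assumption made so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed for this sketch

import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical toy data: 12 samples, 5 features on very different scales
rng = np.random.default_rng(0)
toy = pd.DataFrame(
    rng.normal(size=(12, 5)) * [1, 10, 100, 1000, 0.1],
    columns=["f1", "f2", "f3", "f4", "f5"],
)

# Made-up group labels mapped to colors, as in the post
labels = pd.Series([0, 1] * 6, index=toy.index, name="target")
row_colors = labels.map({0: "g", 1: "skyblue"})

g = sns.clustermap(toy,
                   metric="correlation",
                   standard_scale=1,
                   row_colors=row_colors,
                   yticklabels=False)

# Positions of the original rows in the order they appear in the heatmap
row_order = g.dendrogram_row.reordered_ind
print(toy.index[row_order].tolist())
```

The returned grid also has its own savefig method, which can be used in place of plt.savefig.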
