Multiple Density Plots with Pandas in Python

datavizpyr · January 17, 2020 ·

Often you may have data belonging to multiple groups. Visualizing the groups as multiple density plots is a great way to understand the similarities and differences between them.

In this tutorial, we will learn how to make multiple density plots using Pandas in Python. We will use US developer salaries from the Stack Overflow survey, grouped by educational qualification, to make multiple density plots with Pandas.

Let us first load the processed data from the Stack Overflow survey. The processed data is available on datavizpyr.com’s GitHub.

import pandas as pd
import matplotlib.pyplot as plt

# salary data derived from https://datavizpyr.com/density-plots-with-pandas-in-python/
stackoverflow_salary_file = "https://raw.githubusercontent.com/datavizpyr/data/master/SO_data_2019/2019_Stack_Overflow_Survey_Education_Salary_US.tsv"
# load the tab-separated salary data
salary = pd.read_csv(stackoverflow_salary_file, sep="\t")
salary.head()

	CompTotal	Education
0	180000.0	Master's
1	55000.0	Bachelor's
2	77000.0	Bachelor's
3	67017.0	Bachelor's
4	90000.0	Less than bachelor's

By visualizing the distribution of developer salaries at different levels of education as a multiple density plot, we can see how salary varies with degree for developers in the US.

We can make multiple density plots with Pandas’ plot.density() function. See the earlier post, Density Plots with Pandas in Python, for making a simple density plot using Pandas.

However, the density() function in Pandas needs the data in wide form, i.e. each group’s values in its own column.

We can reshape the dataframe from long form to wide form using the pivot() function.

salary_wide = salary.pivot(columns="Education", values="CompTotal")

Now we have our data in the right form to make multiple density plots using Pandas.

salary_wide.head()

Education	Bachelor's	Less than bachelor's	Master's	PhD	Professional
0	NaN	NaN	180000.0	NaN	NaN
1	55000.0	NaN	NaN	NaN	NaN
2	77000.0	NaN	NaN	NaN	NaN
3	67017.0	NaN	NaN	NaN	NaN
4	NaN	90000.0	NaN	NaN	NaN
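To see what the reshape does on a small scale, here is a minimal sketch using a hypothetical four-row dataframe (the values in toy_long are made up for illustration, not taken from the survey). Because no index is passed, pivot() keeps the existing row index, so each row ends up with a value in exactly one column and NaN everywhere else.

```python
import pandas as pd

# Hypothetical toy data in long form: one row per respondent.
toy_long = pd.DataFrame({
    "CompTotal": [180000.0, 55000.0, 77000.0, 90000.0],
    "Education": ["Master's", "Bachelor's", "Bachelor's", "PhD"],
})

# No index is given, so pivot() keeps the existing row index.
# Each row gets a value in one Education column and NaN in the others.
toy_wide = toy_long.pivot(columns="Education", values="CompTotal")

# count() ignores NaN, so it recovers the number of respondents per group.
group_sizes = toy_wide.count()

print(toy_wide)
print(group_sizes)
```

The NaN values are expected here. They only mark rows that belong to a different group, and count() shows that no observations were lost in the reshape.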

How To Make Multiple Density Plots with Pandas?

We can call the plot.density() function on the wide-form salary data to make multiple density plots. Pandas’ plot.density() function draws one density curve for every column in the wide dataframe. In this case we have five groups, so we get five density curves on the same plot.

salary_wide.plot.density(figsize=(8,6),xlim=(5000,1e6),linewidth=4)
plt.savefig("multiple_density_plots_with_Pandas_Python.jpg")

In this density plot, we set the x-axis limits to focus on a reasonable range of salaries. Note that Pandas colors each density curve differently and labels each curve in the legend with its column name.

Figure: Multiple Density Plots with Pandas
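The one-curve-per-column behavior can be checked without downloading the survey data. The sketch below is only an illustration: the three groups and their log-normal "salaries" are synthetic, and the Agg backend is chosen so the script runs without a display. It assumes SciPy is installed, since Pandas uses it for the kernel density estimate.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical wide-form data: one column per group, synthetic values.
toy_wide = pd.DataFrame({
    "Group A": rng.lognormal(mean=11.3, sigma=0.4, size=200),
    "Group B": rng.lognormal(mean=11.5, sigma=0.4, size=200),
    "Group C": rng.lognormal(mean=11.7, sigma=0.4, size=200),
})

# plot.density() returns a matplotlib Axes with one line per column.
ax = toy_wide.plot.density(figsize=(8, 6), linewidth=4)

# The legend labels come from the column names.
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(len(ax.get_lines()), legend_labels)
```

Holding on to the returned Axes, rather than relying only on plt, makes it easy to inspect or further customize the plot before saving it.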

As we saw before, the density plot has a long tail, so putting the x-axis on a log scale makes the multiple density plot easier to read. We can switch the x-axis to log scale with the logx=True argument to the density() function.

salary_wide.plot.density(figsize=(8,6),
                         logx=True,
                         xlim=(5000,1e6),
                         linewidth=4, 
                         fontsize=14)
plt.xlabel("Salary in US", size=14)
plt.savefig("Multiple_density_plots_with_log_scale_Pandas_Python.jpg")

With a log scale on the multiple density plot, we can see the effect of education on a developer’s salary more clearly. On average, developers with a PhD make more money than the others, closely followed by developers with a Master’s degree.

Figure: Multiple Density Plots with log scale using Pandas
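A density plot shows the shape of each distribution, but it is hard to read typical values off the curves. One way to back up a visual ranking is a per-group summary with groupby(). This is a minimal sketch on a hypothetical toy dataframe (the numbers are made up, not survey results); the same call should work on the long-form salary dataframe, assuming its columns are named CompTotal and Education as shown earlier.

```python
import pandas as pd

# Hypothetical toy data in long form; values are invented for illustration.
toy = pd.DataFrame({
    "CompTotal": [60000.0, 80000.0, 100000.0, 90000.0, 110000.0, 120000.0, 140000.0],
    "Education": ["Bachelor's", "Bachelor's", "Bachelor's",
                  "Master's", "Master's", "PhD", "PhD"],
})

# Per-group sample size, median and mean, sorted by median (highest first).
summary = (
    toy.groupby("Education")["CompTotal"]
       .agg(["count", "median", "mean"])
       .sort_values("median", ascending=False)
)
print(summary)
```

With long-tailed data such as salaries, the median is usually a more robust summary than the mean, which is why the sketch sorts by it.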
