Sometimes while making a barplot, you might want to highlight a single bar in a different color from the other bars. In this post, we will learn how to highlight a bar in a barplot using ggplot2 in R. A barplot displays counts or some other quantitative variable for multiple categories as bars.
Let us load tidyverse packages.
library(tidyverse)
We will use results from the Stack Overflow Developer Survey for the year 2019 to make barplots of salary for multiple educational qualifications.
stackoverflow_file <- "https://raw.githubusercontent.com/datavizpyr/data/master/SO_data_2019/StackOverflow_survey_filtered_subsampled_2019.csv"
# load Stack Overflow survey results
survey_results <- read_csv(stackoverflow_file)
survey_results %>%
  select(CompTotal, Education)
## # A tibble: 6 x 2
##   CompTotal Education
##       <dbl> <chr>
## 1    180000 Master's
## 2     55000 Bachelor's
## 3     77000 Bachelor's
## 4     67017 Bachelor's
## 5     90000 Less than bachelor's
## 6     58000 Bachelor's
Let us first compute the average salary for each educational qualification. We will do a bit of cleanup on the survey results for the convenience of plotting in this example: we will ignore rows with NA values and also ignore education levels below a Bachelor’s degree. We will then use dplyr’s group_by() and summarize() functions to compute the average salary per educational group.
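df <- survey_results %>%
  # ignore rows with missing salary or education values
  drop_na(CompTotal, Education) %>%
  # ignore education levels below a Bachelor's degree
  filter(Education != "Less than bachelor's") %>%
  group_by(Education) %>%
  summarize(ave_salary = mean(CompTotal))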
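To see what the filtering and averaging steps do, here is a minimal sketch on a small made-up data frame. The name toy and all of its values are hypothetical and are not taken from the survey.

```r
library(dplyr)

# Hypothetical toy data (not from the survey), with two incomplete rows
toy <- tibble(
  Education = c("Bachelor's", "Bachelor's", "Master's", "Master's", NA),
  CompTotal = c(50000, 70000, 90000, NA, 60000)
)

toy_summary <- toy %>%
  # drop rows where either column is missing
  filter(!is.na(Education), !is.na(CompTotal)) %>%
  group_by(Education) %>%
  summarize(ave_salary = mean(CompTotal))

toy_summary
```

Without the filter() step, the Master's average would come out as NA, because mean() returns NA when any input is missing unless na.rm = TRUE is set.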
In this example, we will highlight (color) the bar corresponding to the educational category with the maximum average salary. To color one bar/group selectively, we will create an additional column in the data frame using mutate() such that the column has the value “1” for the maximum average salary and “0” for the rest. We use ifelse() to create the new column.
df_max <- df %>%
  mutate(max_salary = ifelse(ave_salary == max(ave_salary), "1", "0"))
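One detail worth knowing about this flag: because ifelse() compares every row against the maximum, tied groups would all be flagged. The sketch below illustrates this on hypothetical values (the toy data frame and its numbers are made up), and shows one possible alternative using which.max(), which flags only the first maximum.

```r
library(dplyr)

# Hypothetical averages, chosen so that two groups tie for the maximum
toy <- tibble(
  group = c("A", "B", "C"),
  ave_salary = c(100, 120, 120)
)

# Same approach as in the post: every row equal to the maximum gets "1"
toy_flagged <- toy %>%
  mutate(max_salary = ifelse(ave_salary == max(ave_salary), "1", "0"))

# Alternative sketch: flag only the first row holding the maximum
toy_first <- toy %>%
  mutate(max_salary = ifelse(row_number() == which.max(ave_salary), "1", "0"))

toy_flagged$max_salary
toy_first$max_salary
```

With the survey averages used in this post there is no tie, so both versions would flag the same single bar.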
Now we have the data ready to make a barplot and highlight a single bar with ggplot2.
df_max
## # A tibble: 4 x 3
##   Education    ave_salary max_salary
##   <chr>             <dbl> <chr>
## 1 Bachelor's      115088. 0
## 2 Master's        131505. 0
## 3 PhD             151564. 1
## 4 Professional    121119. 0
Highlight a Bar in Barplot with ggplot2 in R: First Try
A simple way to highlight a bar in a barplot is to map the new variable we created to the fill aesthetic in ggplot(). In this example, we specify fill=max_salary inside aes() and make the barplot with geom_col().
df_max %>%
  ggplot(aes(x = Education, y = ave_salary, fill = max_salary)) +
  geom_col(width = 0.5) +
  labs(title = "Highlighting a Bar in Barplot in ggplot2")
You can see that we have a barplot with bars in two colors and a legend. The bar we wanted to highlight is in a different color from the other bars, as intended.
However, with this approach ggplot2 picks its two default colors. We can do better by highlighting the bar with a color of our choice and keeping the other bars grey. We can use the scale_fill_manual() function to manually specify colors for the two groups in the variable mapped to the fill aesthetic.
Highlight a Bar in Barplot with ggplot2 in R: A better approach
df_max %>%
  ggplot(aes(x = Education, y = ave_salary, fill = max_salary)) +
  geom_col(width = 0.5) +
  # map the flag values to colors; guide = "none" hides the legend
  scale_fill_manual(values = c("1" = "red", "0" = "darkgray"), guide = "none") +
  labs(title = "Highlighting a Bar in Barplot in ggplot2")
Now we have highlighted a single bar in the barplot in red, with the other bars in dark gray. Compared to the previous plot, this clearly directs attention to the highlighted bar.
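For readers who want to try the highlighting step without downloading the survey file, here is a minimal self-contained sketch. It assumes the four averages printed in the df_max output above, copied by hand and rounded; the names df_demo and p are introduced only for this illustration.

```r
library(ggplot2)

# Averages copied (rounded) from the df_max output shown earlier in the post
df_demo <- data.frame(
  Education  = c("Bachelor's", "Master's", "PhD", "Professional"),
  ave_salary = c(115088, 131505, 151564, 121119)
)

# Flag the group with the maximum average salary, as in the post
df_demo$max_salary <- ifelse(
  df_demo$ave_salary == max(df_demo$ave_salary), "1", "0"
)

p <- ggplot(df_demo, aes(x = Education, y = ave_salary, fill = max_salary)) +
  geom_col(width = 0.5) +
  # named vector: flag value -> color; legend suppressed
  scale_fill_manual(values = c("1" = "red", "0" = "darkgray"), guide = "none") +
  labs(title = "Highlighting a Bar in Barplot in ggplot2")

print(p)
```

Naming the elements of values ties each color to a flag value explicitly, so the result does not depend on the order in which the two levels appear.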
Note: as of ggplot2 version 3.3.5, the guide = FALSE option is deprecated; use guide = "none" instead.